Routine: PZHENTRD()  File: SRC\pzhentrd.f

 
 
# lines: 587
  # code: 587
  # comment: 0
  # blank:0
# Variables:77
# Callers:1
# Callings:6
# Words:278
# Keywords:149
 

 

..
     .. Array Arguments ..
     ..
  Bugs
  ====
  Support for UPLO='U' is limited to calling the old, slow, PZHETRD
  code.
  Purpose
  =======
  PZHENTRD is a prototype version of PZHETRD which uses tailored
  codes (either the serial, ZHETRD, or the parallel code, PZHETTRD)
  when the workspace provided by the user is adequate.
  PZHENTRD reduces a complex Hermitian matrix sub( A ) to Hermitian
  tridiagonal form T by an unitary similarity transformation:
  Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
  Features
  ========
  PZHENTRD is faster than PZHETRD on almost all matrices,
  particularly small ones (i.e. N < 500 * sqrt(P) ), provided that
  enough workspace is available to use the tailored codes.
  The tailored codes provide performance that is essentially
  independent of the input data layout.
  The tailored codes place no restrictions on IA, JA, MB or NB.
  At present, IA, JA, MB and NB are restricted to those values allowed
  by PZHETRD to keep the interface simple.  These restrictions are
  documented below.  (Search for "restrictions".)
  Notes
  =====
  Each global data object is described by an associated description
  vector.  This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclicly distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is distribu-
                                 ted over. The context itself is glo-
                                 bal, but the handle (the integer
                                 value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
  Let K be the number of rows or columns of a distributed matrix,
  and assume that its process grid has dimension p x q.
  LOCr( K ) denotes the number of elements of K that a process
  would receive if K were distributed over the p processes of its
  process column.
  Similarly, LOCc( K ) denotes the number of elements of K that a
  process would receive if K were distributed over the q processes of
  its process row.
  The values of LOCr() and LOCc() may be determined via a call to the
  ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
  Arguments
  =========
  UPLO    (global input) CHARACTER
          Specifies whether the upper or lower triangular part of the
          Hermitian matrix sub( A ) is stored:
          = 'U':  Upper triangular
          = 'L':  Lower triangular
  N       (global input) INTEGER
          The number of rows and columns to be operated on, i.e. the
          order of the distributed submatrix sub( A ). N >= 0.
  A       (local input/local output) COMPLEX*16 pointer into the
          local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
          On entry, this array contains the local pieces of the
          Hermitian distributed matrix sub( A ).  If UPLO = 'U', the
          leading N-by-N upper triangular part of sub( A ) contains
          the upper triangular part of the matrix, and its strictly
          lower triangular part is not referenced. If UPLO = 'L', the
          leading N-by-N lower triangular part of sub( A ) contains the
          lower triangular part of the matrix, and its strictly upper
          triangular part is not referenced. On exit, if UPLO = 'U',
          the diagonal and first superdiagonal of sub( A ) are over-
          written by the corresponding elements of the tridiagonal
          matrix T, and the elements above the first superdiagonal,
          with the array TAU, represent the unitary matrix Q as a
          product of elementary reflectors; if UPLO = 'L', the diagonal
          and first subdiagonal of sub( A ) are overwritten by the
          corresponding elements of the tridiagonal matrix T, and the
          elements below the first subdiagonal, with the array TAU,
          represent the unitary matrix Q as a product of elementary
          reflectors. See Further Details.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global and local input) INTEGER array of dimension DLEN_.
          The array descriptor for the distributed matrix A.
  D       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          The diagonal elements of the tridiagonal matrix T:
          D(i) = A(i,i). D is tied to the distributed matrix A.
  E       (local output) DOUBLE PRECISION array, dimension LOCc(JA+N-1)
          if UPLO = 'U', LOCc(JA+N-2) otherwise. The off-diagonal
          elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
          UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
          distributed matrix A.
  TAU     (local output) COMPLEX*16, array, dimension
          LOCc(JA+N-1). This array contains the scalar factors TAU of
          the elementary reflectors. TAU is tied to the distributed
          matrix A.
  WORK    (local workspace/local output) COMPLEX*16 array,
                                                  dimension (LWORK)
          On exit, WORK( 1 ) returns the optimal LWORK.
  LWORK   (local or global input) INTEGER
          The dimension of the array WORK.
          LWORK is local input and must be at least
          LWORK >= MAX( NB * ( NP +1 ), 3 * NB )
          For optimal performance, greater workspace is needed, i.e.
            LWORK >= 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS + 4 ) * NPS
            ICTXT = DESCA( CTXT_ )
            ANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 )
            SQNPC = INT( SQRT( DBLE( NPROW * NPCOL ) ) )
            NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB )
            NUMROC is a ScaLAPACK tool functions;
            PJLAENV is a ScaLAPACK envionmental inquiry function
            MYROW, MYCOL, NPROW and NPCOL can be determined by calling
            the subroutine BLACS_GRIDINFO.
  RWORK   (local workspace/local output) COMPLEX*16 array,
                                                  dimension (LRWORK)
          On exit, RWORK( 1 ) returns the optimal LRWORK.
  LRWORK  (local or global input) INTEGER
          The dimension of the array RWORK.
          LRWORK is local input and must be at least
          LRWORK >= 1
          For optimal performance, greater workspace is needed, i.e.
            LRWORK >= MAX( 2 * N )
  INFO    (global output) INTEGER
          = 0:  successful exit
          < 0:  If the i-th argument is an array and the j-entry had
                an illegal value, then INFO = -(i*100+j), if the i-th
                argument is a scalar and had an illegal value, then
                INFO = -i.
  Further Details
  ===============
  If UPLO = 'U', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(n-1) . . . H(2) H(1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
  A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
  If UPLO = 'L', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(1) H(2) . . . H(n-1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
  A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
  The contents of sub( A ) on exit are illustrated by the following
  examples with n = 5:
  if UPLO = 'U':                       if UPLO = 'L':
    (  d   e   v2  v3  v4 )              (  d                  )
    (      d   e   v3  v4 )              (  e   d              )
    (          d   e   v4 )              (  v1  e   d          )
    (              d   e  )              (  v1  v2  e   d      )
    (                  d  )              (  v1  v2  v3  e   d  )
  where d and e denote diagonal and off-diagonal elements of T, and vi
  denotes an element of the vector defining H(i).
  Alignment requirements
  ======================
  The distributed submatrix sub( A ) must verify some alignment proper-
  ties, namely the following expression should be true:
  ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA .AND. IROFFA.EQ.0 ) with
  IROFFA = MOD( IA-1, MB_A ) and ICOFFA = MOD( JA-1, NB_A ).
  =====================================================================
     .. Parameters ..

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PZHENTRD( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002       $LWORK , RWORK , LRWORK , INFO )
003  
004  *     -- ScaLAPACK routine(version 1.7) --
005  *     University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006  *     and University of California , Berkeley.
007  *     October 15 , 1999
008  
009  *     .. Scalar Arguments ..
010        CHARACTER UPLO
011        INTEGER IA , INFO , JA , LRWORK , LWORK , N
012        INTEGER BLOCK_CYCLIC_2D , DLEN_ , DTYPE_ , CTXT_ , M_ , N_ ,
013       $MB_ , NB_ , RSRC_ , CSRC_ , LLD_
014        PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015       $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016       $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017        DOUBLE PRECISION ONE
018        PARAMETER( ONE = 1.0D + 0 )
019        COMPLEX*16 CONE
020        PARAMETER( CONE =( 1.0D + 0 , 0.0D + 0 ) )
021  *     ..
022  *     .. Local Scalars ..
023        LOGICAL LQUERY , UPPER
024        CHARACTER COLCTOP , ROWCTOP
025        INTEGER ANB , CTXTB , I , IACOL , IAROW , ICOFFA , ICTXT ,
026       $IINFO , INDB , INDRD , INDRE , INDTAU , INDW , IPW ,
027       $IROFFA , J , JB , JX , K , KK , LLRWORK , LLWORK ,
028       $LRWMIN , LWMIN , MINSZ , MYCOL , MYCOLB , MYROW ,
029       $MYROWB , NB , NP , NPCOL , NPCOLB , NPROW , NPROWB ,
030       $NPS , NQ , ONEPMIN , ONEPRMIN , SQNPC , TTLRWMIN ,
031       $TTLWMIN
032  *     ..
033  *     .. Local Arrays ..
034        INTEGER DESCB( DLEN_ ) , DESCW( DLEN_ ) , IDUM1( 3 ) ,
035       $IDUM2( 3 )
036  *     ..
037  *     .. External Subroutines ..
038        EXTERNAL BLACS_GET , BLACS_GRIDEXIT , BLACS_GRIDINFO ,
039       $BLACS_GRIDINIT , CHK1MAT , DESCSET , IGAMN2D ,
040       $PCHK1MAT , PDLAMR1D , PB_TOPGET , PB_TOPSET ,
041       $PXERBLA , PZELSET , PZHER2K , PZHETD2 , PZHETTRD ,
042       $PZLAMR1D , PZLATRD , PZTRMR2D , ZHETRD
043  *     ..
044  *     .. External Functions ..
045        LOGICAL LSAME
046        INTEGER INDXG2L , INDXG2P , NUMROC , PJLAENV
047        EXTERNAL LSAME , INDXG2L , INDXG2P , NUMROC , PJLAENV
048  *     ..
049  *     .. Intrinsic Functions ..
050        INTRINSIC DBLE , DCMPLX , ICHAR , INT , MAX , MIN , MOD , SQRT
051  *     ..
052  *     .. Executable Statements ..
053  
054  *     This is just to keep ftnchek and toolpack / 1 happy
055        IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_*
056       $    RSRC_.LT.0 )RETURN
057  *         Get grid parameters
058  
059            ICTXT = DESCA( CTXT_ )
060            CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
061  
062  *         Test the input parameters
063  
064            INFO = 0
065            IF( NPROW.EQ. - 1 ) THEN
066                INFO = - ( 600 + CTXT_ )
067            ELSE
068                CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
069                UPPER = LSAME( UPLO , 'U' )
070                IF( INFO.EQ.0 ) THEN
071                    NB = DESCA( NB_ )
072                    IROFFA = MOD( IA - 1 , DESCA( MB_ ) )
073                    ICOFFA = MOD( JA - 1 , DESCA( NB_ ) )
074                    IAROW = INDXG2P( IA , NB , MYROW , DESCA( RSRC_ ) , NPROW )
075                    IACOL = INDXG2P( JA , NB , MYCOL , DESCA( CSRC_ ) , NPCOL )
076                    NP = NUMROC( N , NB , MYROW , IAROW , NPROW )
077                    NQ = MAX( 1 , NUMROC( N + JA - 1 , NB , MYCOL , DESCA( CSRC_ ) ,
078       $            NPCOL ) )
079                    LWMIN = MAX(( NP + 1 )*NB , 3*NB )
080                    ANB = PJLAENV( ICTXT , 3 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
081                    MINSZ = PJLAENV( ICTXT , 5 , 'PZHETTRD' , 'L' , 0 , 0 , 0 , 0 )
082                    SQNPC = INT( SQRT( DBLE( NPROW*NPCOL ) ) )
083                    NPS = MAX( NUMROC( N , 1 , 0 , 0 , SQNPC ) , 2*ANB )
084                    TTLWMIN = 2*( ANB + 1 )*( 4*NPS + 2 ) + ( NPS + 2 )*NPS
085                    LRWMIN = 1
086                    TTLRWMIN = 2*NPS
087  
088                    WORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) )
089                    RWORK( 1 ) = DBLE( TTLRWMIN )
090                    LQUERY =( LWORK.EQ. - 1 .OR. LRWORK.EQ. - 1 )
091                    IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO , 'L' ) ) THEN
092                        INFO = - 1
093  
094  *                     The following two restrictions are not necessary provided
095  *                     that either of the tailored codes are used.
096  
097                    ELSE IF( IROFFA.NE.ICOFFA .OR. ICOFFA.NE.0 ) THEN
098                        INFO = - 5
099                    ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
100                        INFO = - ( 600 + NB_ )
101                    ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
102                        INFO = - 11
103                    ELSE IF( LRWORK.LT.LRWMIN .AND. .NOT.LQUERY ) THEN
104                        INFO = - 13
105                    END IF
106                END IF
107                IF( UPPER ) THEN
108                    IDUM1( 1 ) = ICHAR( 'U' )
109                ELSE
110                    IDUM1( 1 ) = ICHAR( 'L' )
111                END IF
112                IDUM2( 1 ) = 1
113                IF( LWORK.EQ. - 1 ) THEN
114                    IDUM1( 2 ) = - 1
115                ELSE
116                    IDUM1( 2 ) = 1
117                END IF
118                IDUM2( 2 ) = 11
119                IF( LRWORK.EQ. - 1 ) THEN
120                    IDUM1( 3 ) = - 1
121                ELSE
122                    IDUM1( 3 ) = 1
123                END IF
124                IDUM2( 3 ) = 13
125                CALL PCHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , 3 , IDUM1 , IDUM2 ,
126       $        INFO )
127            END IF
128  
129            IF( INFO.NE.0 ) THEN
130                CALL PXERBLA( ICTXT , 'PZHENTRD' , - INFO )
131                RETURN
132            ELSE IF( LQUERY ) THEN
133                RETURN
134            END IF
135  
136  *         Quick return if possible
137  
138            IF( N.EQ.0 )
139       $        RETURN
140  
141                ONEPMIN = N*N + 3*N + 1
142                LLWORK = LWORK
143                CALL IGAMN2D( ICTXT , 'A' , ' ' , 1 , 1 , LLWORK , 1 , 1 , - 1 , - 1 , - 1 ,
144       $        - 1 )
145  
146                ONEPRMIN = 2*N
147                LLRWORK = LRWORK
148                CALL IGAMN2D( ICTXT , 'A' , ' ' , 1 , 1 , LLRWORK , 1 , 1 , - 1 , - 1 , - 1 ,
149       $        - 1 )
150  
151  *             Use the serial , LAPACK , code : ZTRD on small matrices if we
152  *             we have enough space.
153  
154                NPROWB = 0
155                IF(( N.LT.MINSZ .OR. SQNPC.EQ.1 ) .AND. LLWORK.GE.ONEPMIN .AND.
156       $        LLRWORK.GE.ONEPRMIN .AND. .NOT.UPPER ) THEN
157                NPROWB = 1
158                NPS = N
159            ELSE
160                IF( LLWORK.GE.TTLWMIN .AND. LLRWORK.GE.TTLRWMIN .AND. .NOT.
161       $            UPPER ) THEN
162                    NPROWB = SQNPC
163                END IF
164            END IF
165  
166            IF( NPROWB.GE.1 ) THEN
167                NPCOLB = NPROWB
168                SQNPC = NPROWB
169                INDB = 1
170                INDRD = 1
171                INDRE = INDRD + NPS
172                INDTAU = INDB + NPS*NPS
173                INDW = INDTAU + NPS
174                LLWORK = LLWORK - INDW + 1
175  
176                CALL BLACS_GET( ICTXT , 10 , CTXTB )
177                CALL BLACS_GRIDINIT( CTXTB , 'Row major' , SQNPC , SQNPC )
178                CALL BLACS_GRIDINFO( CTXTB , NPROWB , NPCOLB , MYROWB , MYCOLB )
179                CALL DESCSET( DESCB , N , N , 1 , 1 , 0 , 0 , CTXTB , NPS )
180  
181                CALL PZTRMR2D( UPLO , 'N' , N , N , A , IA , JA , DESCA , WORK( INDB ) ,
182       $        1 , 1 , DESCB , ICTXT )
183  
184  *             Only those processors in context CTXTB are needed for a while
185  
186                IF( NPROWB.GT.0 ) THEN
187  
188                    IF( NPROWB.EQ.1 ) THEN
189                        CALL ZHETRD( UPLO , N , WORK( INDB ) , NPS , RWORK( INDRD ) ,
190       $                RWORK( INDRE ) , WORK( INDTAU ) ,
191       $                WORK( INDW ) , LLWORK , INFO )
192                    ELSE
193  
194                        CALL PZHETTRD ( 'L' , N , WORK( INDB ) , 1 , 1 , DESCB ,
195       $                RWORK( INDRD ) , RWORK( INDRE ) ,
196       $                WORK( INDTAU ) , WORK( INDW ) , LLWORK ,
197       $                INFO )
198  
199                    END IF
200                END IF
201  
202  *             All processors participate in moving the data back to the
203  *             way that PZHENTRD expects it.
204  
205                CALL PDLAMR1D ( N - 1 , RWORK( INDRE ) , 1 , 1 , DESCB , E , 1 , JA ,
206       $        DESCA )
207  
208                CALL PDLAMR1D ( N , RWORK( INDRD ) , 1 , 1 , DESCB , D , 1 , JA ,
209       $        DESCA )
210  
211                CALL PZLAMR1D ( N , WORK( INDTAU ) , 1 , 1 , DESCB , TAU , 1 , JA ,
212       $        DESCA )
213  
214                CALL PZTRMR2D( UPLO , 'N' , N , N , WORK( INDB ) , 1 , 1 , DESCB , A ,
215       $        IA , JA , DESCA , ICTXT )
216  
217                IF( MYROWB.GE.0 )
218       $            CALL BLACS_GRIDEXIT( CTXTB )
219  
220                ELSE
221  
222                    CALL PB_TOPGET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
223                    CALL PB_TOPGET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
224                    CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , '1 - tree' )
225                    CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , '1 - tree' )
226  
227                    IPW = NP*NB + 1
228  
229                    IF( UPPER ) THEN
230  
231  *                     Reduce the upper triangle of sub( A ).
232  
233                        KK = MOD( JA + N - 1 , NB )
234                        IF( KK.EQ.0 )
235       $                    KK = NB
236                            CALL DESCSET( DESCW , N , NB , NB , NB , IAROW ,
237       $                    INDXG2P( JA + N - KK , NB , MYCOL , DESCA( CSRC_ ) ,
238       $                    NPCOL ) , ICTXT , MAX( 1 , NP ) )
239  
240                            DO 10 K = N - KK + 1 , NB + 1 , - NB
241                                JB = MIN( N - K + 1 , NB )
242                                I = IA + K - 1
243                                J = JA + K - 1
244  
245  *                             Reduce columns I : I + NB - 1 to tridiagonal form and form
246  *                             the matrix W which is needed to update the unreduced part of
247  *                             the matrix
248  
249                                CALL PZLATRD ( UPLO , K + JB - 1 , JB , A , IA , JA , DESCA , D , E ,
250       $                        TAU , WORK , 1 , 1 , DESCW , WORK( IPW ) )
251  
252  *                             Update the unreduced submatrix A(IA : I - 1 , JA : J - 1) , using an
253  *                             update of the form :
254  *                             A(IA : I - 1 , JA : J - 1) := A(IA : I - 1 , JA : J - 1) - V*W' - W*V'
255  
256                                CALL PZHER2K( UPLO , 'No transpose' , K - 1 , JB , - CONE , A ,
257       $                        IA , J , DESCA , WORK , 1 , 1 , DESCW , ONE , A ,
258       $                        IA , JA , DESCA )
259  
260  *                             Copy last superdiagonal element back into sub( A )
261  
262                                JX = MIN( INDXG2L( J , NB , 0 , IACOL , NPCOL ) , NQ )
263                                CALL PZELSET( A , I - 1 , J , DESCA , DCMPLX( E( JX ) ) )
264  
265                                DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + NPCOL - 1 , NPCOL )
266  
267     10                     CONTINUE
268  
269  *                         Use unblocked code to reduce the last or only block
270  
271                            CALL PZHETD2 ( UPLO , MIN( N , NB ) , A , IA , JA , DESCA , D , E ,
272       $                    TAU , WORK , LWORK , IINFO )
273  
274                        ELSE
275  
276  *                         Reduce the lower triangle of sub( A )
277  
278                            KK = MOD( JA + N - 1 , NB )
279                            IF( KK.EQ.0 )
280       $                        KK = NB
281                                CALL DESCSET( DESCW , N , NB , NB , NB , IAROW , IACOL , ICTXT ,
282       $                        MAX( 1 , NP ) )
283  
284                                DO 20 K = 1 , N - NB , NB
285                                    I = IA + K - 1
286                                    J = JA + K - 1
287  
288  *                                 Reduce columns I : I + NB - 1 to tridiagonal form and form
289  *                                 the matrix W which is needed to update the unreduced part
290  *                                 of the matrix
291  
292                                    CALL PZLATRD ( UPLO , N - K + 1 , NB , A , I , J , DESCA , D , E , TAU ,
293       $                            WORK , K , 1 , DESCW , WORK( IPW ) )
294  
295  *                                 Update the unreduced submatrix A(I + NB : IA + N - 1 , I + NB : IA + N - 1) ,
296  *                                 using an update of the form : A(I + NB : IA + N - 1 , I + NB : IA + N - 1) :=
297  *                                 A(I + NB : IA + N - 1 , I + NB : IA + N - 1) - V*W' - W*V'
298  
299                                    CALL PZHER2K( UPLO , 'No transpose' , N - K - NB + 1 , NB , - CONE ,
300       $                            A , I + NB , J , DESCA , WORK , K + NB , 1 , DESCW ,
301       $                            ONE , A , I + NB , J + NB , DESCA )
302  
303  *                                 Copy last subdiagonal element back into sub( A )
304  
305                                    JX = MIN( INDXG2L( J + NB - 1 , NB , 0 , IACOL , NPCOL ) , NQ )
306                                    CALL PZELSET( A , I + NB , J + NB - 1 , DESCA , DCMPLX( E( JX ) ) )
307  
308                                    DESCW( CSRC_ ) = MOD( DESCW( CSRC_ ) + 1 , NPCOL )
309  
310     20                         CONTINUE
311  
312  *                             Use unblocked code to reduce the last or only block
313  
314                                CALL PZHETD2 ( UPLO , KK , A , IA + K - 1 , JA + K - 1 , DESCA , D , E , TAU ,
315       $                        WORK , LWORK , IINFO )
316                            END IF
317  
318                            CALL PB_TOPSET( ICTXT , 'Combine' , 'Columnwise' , COLCTOP )
319                            CALL PB_TOPSET( ICTXT , 'Combine' , 'Rowwise' , ROWCTOP )
320  
321                        END IF
322  
323                        WORK( 1 ) = DCMPLX( DBLE( TTLWMIN ) )
324                        RWORK( 1 ) = DBLE( TTLRWMIN )
325  
326                        RETURN
327  
328  *                     End of PZHENTRD
329  
330                    END