Routine: PZGEHRD()  File: SRC\pzgehrd.f

 
 
 

 

  Purpose
  =======
  PZGEHRD reduces a complex general distributed matrix sub( A )
  to upper Hessenberg form H by a unitary similarity transformation:
  Q' * sub( A ) * Q = H, where
  sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
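  To make the target structure concrete: a matrix is upper Hessenberg
  when every entry below its first subdiagonal is zero. The Python
  sketch below checks that property on a small dense (non-distributed)
  matrix. The helper name is_upper_hessenberg is hypothetical and is
  not part of ScaLAPACK.

```python
def is_upper_hessenberg(h, tol=0.0):
    """Return True if every entry below the first subdiagonal is (near) zero."""
    n = len(h)
    return all(abs(h[r][c]) <= tol
               for c in range(n)
               for r in range(c + 2, n))


# A 4-by-4 upper Hessenberg matrix: zeros below the first subdiagonal.
hess = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [0, 9, 1, 2],
        [0, 0, 3, 4]]

# A full matrix, which is what PZGEHRD typically receives on entry.
full = [[1] * 4 for _ in range(4)]

print(is_upper_hessenberg(hess))  # True
print(is_upper_hessenberg(full))  # False
```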
  Notes
  =====
  Each global data object is described by an associated description
  vector.  This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclically distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is distribu-
                                 ted over. The context itself is glo-
                                 bal, but the handle (the integer
                                 value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
  Let K be the number of rows or columns of a distributed matrix,
  and assume that its process grid has dimension p x q.
  LOCr( K ) denotes the number of elements of K that a process
  would receive if K were distributed over the p processes of its
  process column.
  Similarly, LOCc( K ) denotes the number of elements of K that a
  process would receive if K were distributed over the q processes of
  its process row.
  The values of LOCr() and LOCc() may be determined via a call to the
  ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
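  The LOCr()/LOCc() counts and their upper bound can be checked with a
  few lines of integer arithmetic. The sketch below is a Python
  re-implementation of the block-cyclic counting that NUMROC performs,
  written for illustration only; in a real program the ScaLAPACK tool
  function itself should be called. The function name upper_bound and
  the example sizes are assumptions.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of the n rows/columns owned by process iproc.

    n        : global number of rows or columns
    nb       : blocking factor
    iproc    : coordinate of this process in the grid dimension
    isrcproc : coordinate of the process owning the first row/column
    nprocs   : number of processes in the grid dimension
    """
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source
    nblocks = n // nb                              # number of full blocks
    count = (nblocks // nprocs) * nb               # full rounds of blocks
    extra = nblocks % nprocs                       # leftover full blocks
    if mydist < extra:
        count += nb                                # one more full block
    elif mydist == extra:
        count += n % nb                            # the trailing partial block
    return count


def upper_bound(n, nb, nprocs):
    """ceil( ceil(n/nb) / nprocs ) * nb, using integer arithmetic."""
    ceil_div = lambda a, b: -(-a // b)
    return ceil_div(ceil_div(n, nb), nprocs) * nb


# 10 rows, blocks of 3, spread over 2 process rows starting at row 0:
# blocks [1-3] and [7-9] go to process 0, blocks [4-6] and [10] to process 1.
counts = [numroc(10, 3, p, 0, 2) for p in range(2)]
print(counts)                 # [6, 4]
print(upper_bound(10, 3, 2))  # 6
```

  The local counts always add up to the global size, and none of them
  exceeds the bound.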
  Arguments
  =========
  N       (global input) INTEGER
          The number of rows and columns to be operated on, i.e. the
          order of the distributed submatrix sub( A ). N >= 0.
  ILO     (global input) INTEGER
  IHI     (global input) INTEGER
          It is assumed that sub( A ) is already upper triangular in
          rows IA:IA+ILO-2 and IA+IHI:IA+N-1 and columns JA:JA+ILO-2
          and JA+IHI:JA+N-1. See Further Details. If N > 0,
          1 <= ILO <= IHI <= N; otherwise set ILO = 1, IHI = N.
  A       (local input/local output) COMPLEX*16 pointer into the
          local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
          On entry, this array contains the local pieces of the N-by-N
          general distributed matrix sub( A ) to be reduced. On exit,
          the upper triangle and the first subdiagonal of sub( A ) are
          overwritten with the upper Hessenberg matrix H, and the ele-
          ments below the first subdiagonal, with the array TAU, repre-
          sent the unitary matrix Q as a product of elementary
          reflectors. See Further Details.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global and local input) INTEGER array of dimension DLEN_.
          The array descriptor for the distributed matrix A.
  TAU     (local output) COMPLEX*16 array, dimension LOCc(JA+N-2)
          The scalar factors of the elementary reflectors (see Further
          Details). Elements JA:JA+ILO-2 and JA+IHI-1:JA+N-2 of TAU are
          set to zero. TAU is tied to the distributed matrix A.
  WORK    (local workspace/local output) COMPLEX*16 array,
                                                    dimension (LWORK)
          On exit, WORK( 1 ) returns the minimal and optimal LWORK.
  LWORK   (local or global input) INTEGER
          The dimension of the array WORK.
          LWORK is local input and must be at least
          LWORK >= NB*NB + NB*MAX( IHIP+1, IHLP+INLQ )
          where NB = MB_A = NB_A, IROFFA = MOD( IA-1, NB ),
          ICOFFA = MOD( JA-1, NB ), IOFF = MOD( IA+ILO-2, NB ),
          IAROW = INDXG2P( IA, NB, MYROW, RSRC_A, NPROW ),
          IHIP = NUMROC( IHI+IROFFA, NB, MYROW, IAROW, NPROW ),
          ILROW = INDXG2P( IA+ILO-1, NB, MYROW, RSRC_A, NPROW ),
          IHLP = NUMROC( IHI-ILO+IOFF+1, NB, MYROW, ILROW, NPROW ),
          ILCOL = INDXG2P( JA+ILO-1, NB, MYCOL, CSRC_A, NPCOL ),
          INLQ = NUMROC( N-ILO+IOFF+1, NB, MYCOL, ILCOL, NPCOL ),
          INDXG2P and NUMROC are ScaLAPACK tool functions;
          MYROW, MYCOL, NPROW and NPCOL can be determined by calling
          the subroutine BLACS_GRIDINFO.
          If LWORK = -1, then LWORK is global input and a workspace
          query is assumed; the routine only calculates the minimum
          and optimal size for all work arrays. Each of these
          values is returned in the first entry of the corresponding
          work array, and no error message is issued by PXERBLA.
  INFO    (global output) INTEGER
          = 0:  successful exit
          < 0:  If the i-th argument is an array and the j-th entry
                had an illegal value, then INFO = -(i*100+j); if the
                i-th argument is a scalar and had an illegal value,
                then INFO = -i.
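  Two pieces of the argument bookkeeping above are plain integer
  arithmetic and can be sketched outside ScaLAPACK: the minimal LWORK
  and the decoding of a negative INFO. The Python below re-implements
  the NUMROC and INDXG2P arithmetic for illustration; the names
  pzgehrd_lwmin and decode_info are hypothetical, and in a real program
  the workspace size is best obtained with the LWORK = -1 query.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Local count of n rows/columns (same arithmetic as the tool NUMROC)."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extra = nblocks % nprocs
    if mydist < extra:
        count += nb
    elif mydist == extra:
        count += n % nb
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Grid coordinate owning 1-based global index indxglob (iproc unused)."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def pzgehrd_lwmin(n, ilo, ihi, ia, ja, nb, rsrc, csrc,
                  myrow, mycol, nprow, npcol):
    """Evaluate NB*NB + NB*MAX( IHIP+1, IHLP+INLQ ) for one process."""
    iroffa = (ia - 1) % nb
    ioff = (ia + ilo - 2) % nb
    iarow = indxg2p(ia, nb, myrow, rsrc, nprow)
    ihip = numroc(ihi + iroffa, nb, myrow, iarow, nprow)
    ilrow = indxg2p(ia + ilo - 1, nb, myrow, rsrc, nprow)
    ihlp = numroc(ihi - ilo + ioff + 1, nb, myrow, ilrow, nprow)
    ilcol = indxg2p(ja + ilo - 1, nb, mycol, csrc, npcol)
    inlq = numroc(n - ilo + ioff + 1, nb, mycol, ilcol, npcol)
    return nb * nb + nb * max(ihip + 1, ihlp + inlq)


def decode_info(info):
    """Split a PZGEHRD INFO value into (argument, entry).

    Returns None for INFO = 0, (i, None) for a bad scalar argument i,
    and (i, j) for a bad entry j of array argument i.
    """
    if info == 0:
        return None
    if info > 0:
        raise ValueError("the documentation above defines only INFO <= 0")
    code = -info
    if code >= 100:
        return (code // 100, code % 100)
    return (code, None)


# N = 8, ILO = 1, IHI = 8, IA = JA = 1, NB = 2.
print(pzgehrd_lwmin(8, 1, 8, 1, 1, 2, 0, 0, 0, 0, 1, 1))  # 36 on a 1 x 1 grid
print(pzgehrd_lwmin(8, 1, 8, 1, 1, 2, 0, 0, 0, 0, 2, 2))  # 20 on a 2 x 2 grid
print(decode_info(-706))  # (7, 6): entry 6 (NB_) of argument 7 (DESCA)
print(decode_info(-10))   # (10, None): argument 10 (LWORK)
```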
  Further Details
  ===============
  The matrix Q is represented as a product of (ihi-ilo) elementary
  reflectors
     Q = H(ilo) H(ilo+1) . . . H(ihi-1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a complex scalar, and v is a complex vector with
  v(1:I) = 0, v(I+1) = 1 and v(IHI+1:N) = 0; v(I+2:IHI) is stored on
  exit in A(IA+ILO+I:IA+IHI-1,JA+ILO+I-2), and tau in TAU(JA+ILO+I-2).
  The contents of A(IA:IA+N-1,JA:JA+N-1) are illustrated by the follow-
  ing example, with N = 7, ILO = 2 and IHI = 6:
  on entry                         on exit
  ( a   a   a   a   a   a   a )    (  a   a   h   h   h   h   a )
  (     a   a   a   a   a   a )    (      a   h   h   h   h   a )
  (     a   a   a   a   a   a )    (      h   h   h   h   h   h )
  (     a   a   a   a   a   a )    (      v2  h   h   h   h   h )
  (     a   a   a   a   a   a )    (      v2  v3  h   h   h   h )
  (     a   a   a   a   a   a )    (      v2  v3  v4  h   h   h )
  (                         a )    (                          a )
  where a denotes an element of the original matrix sub( A ), h denotes
  a modified element of the upper Hessenberg matrix H, and vi denotes
  an element of the vector defining H(JA+ILO+I-2).
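  The structure of a single reflector H(i) = I - tau * v * v' can be
  checked numerically on a small dense example. The Python sketch below
  builds v with the zero/one pattern described above (N = 5, I = 1,
  IHI = 4) and uses the real value tau = 2 / (v' * v), which is one
  choice that makes H(i) unitary. That choice is an assumption made to
  keep the example short: the tau values computed by PZGEHRD are in
  general complex. The helper names are hypothetical.

```python
def reflector(tau, v):
    """Dense matrix I - tau * v * v^H for a list of complex numbers v."""
    n = len(v)
    return [[(1 if r == c else 0) - tau * v[r] * v[c].conjugate()
             for c in range(n)]
            for r in range(n)]


def gram(h):
    """Dense product h^H * h."""
    n = len(h)
    return [[sum(h[k][r].conjugate() * h[k][c] for k in range(n))
             for c in range(n)]
            for r in range(n)]


n, i, ihi = 5, 1, 4
v = [0j] * n                        # v(1:I) = 0 and v(IHI+1:N) = 0
v[i] = 1 + 0j                       # v(I+1) = 1 (0-based index i)
v[i + 1:ihi] = [0.5 - 0.25j, -1j]   # v(I+2:IHI), arbitrary sample values

tau = 2.0 / sum(abs(x) ** 2 for x in v)  # assumed real tau, see text
h = reflector(tau, v)
g = gram(h)

# Largest deviation of h^H * h from the identity (rounding error only).
err = max(abs(g[r][c] - (1 if r == c else 0))
          for r in range(n) for c in range(n))
print(err < 1e-12)
```

  Rows and columns 1 and 5 of H(i) are those of the identity, because
  the corresponding entries of v are zero; the reflector acts only on
  rows I+1 through IHI.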
  Alignment requirements
  ======================
  The distributed submatrix sub( A ) must satisfy the following
  alignment condition:
  ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA )
  The argument checks in the code are stricter: they also require
  IROFFA = 0 and set INFO = -6 otherwise.
  =====================================================================
      SUBROUTINE PZGEHRD( N, ILO, IHI, A, IA, JA, DESCA, TAU, WORK,
     $                    LWORK, INFO )
*
*  -- ScaLAPACK routine (version 1.7) --
*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
*     and University of California, Berkeley.
*     May 25, 2001
*
*     .. Scalar Arguments ..
      INTEGER            IA, IHI, ILO, INFO, JA, LWORK, N
*     ..
*     .. Array Arguments ..
      INTEGER            DESCA( * )
      COMPLEX*16         A( * ), TAU( * ), WORK( * )
*     ..
*     .. Parameters ..
      INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
     $                   LLD_, MB_, M_, NB_, N_, RSRC_
      PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
     $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
     $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
      COMPLEX*16         ONE, ZERO
      PARAMETER          ( ONE = ( 1.0D+0, 0.0D+0 ),
     $                     ZERO = ( 0.0D+0, 0.0D+0 ) )
*     ..
*     .. Local Scalars ..
      LOGICAL            LQUERY
      CHARACTER          COLCTOP, ROWCTOP
      INTEGER            I, IACOL, IAROW, IB, ICOFFA, ICTXT, IHIP,
     $                   IHLP, IIA, IINFO, ILCOL, ILROW, IMCOL, INLQ,
     $                   IOFF, IPT, IPW, IPY, IROFFA, J, JJ, JJA, JY,
     $                   K, L, LWMIN, MYCOL, MYROW, NB, NPCOL, NPROW,
     $                   NQ
      COMPLEX*16         EI
*     ..
*     .. Local Arrays ..
      INTEGER            DESCY( DLEN_ ), IDUM1( 3 ), IDUM2( 3 )
*     ..
*     .. External Subroutines ..
      EXTERNAL           BLACS_GRIDINFO, CHK1MAT, DESCSET, INFOG1L,
     $                   INFOG2L, PCHK1MAT, PB_TOPGET, PB_TOPSET,
     $                   PXERBLA, PZGEMM, PZGEHD2, PZLAHRD, PZLARFB
*     ..
*     .. External Functions ..
      INTEGER            INDXG2P, NUMROC
      EXTERNAL           INDXG2P, NUMROC
*     ..
*     .. Intrinsic Functions ..
      INTRINSIC          DBLE, DCMPLX, MAX, MIN, MOD
*     ..
*     .. Executable Statements ..
*
*     Get grid parameters
*
      ICTXT = DESCA( CTXT_ )
      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
*
*     Test the input parameters
*
      INFO = 0
      IF( NPROW.EQ.-1 ) THEN
         INFO = -(700+CTXT_)
      ELSE
         CALL CHK1MAT( N, 1, N, 1, IA, JA, DESCA, 7, INFO )
         IF( INFO.EQ.0 ) THEN
            NB = DESCA( NB_ )
            IROFFA = MOD( IA-1, NB )
            ICOFFA = MOD( JA-1, NB )
            CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL,
     $                    IIA, JJA, IAROW, IACOL )
            IHIP = NUMROC( IHI+IROFFA, NB, MYROW, IAROW, NPROW )
            IOFF = MOD( IA+ILO-2, NB )
            ILROW = INDXG2P( IA+ILO-1, NB, MYROW, DESCA( RSRC_ ),
     $                       NPROW )
            IHLP = NUMROC( IHI-ILO+IOFF+1, NB, MYROW, ILROW, NPROW )
            ILCOL = INDXG2P( JA+ILO-1, NB, MYCOL, DESCA( CSRC_ ),
     $                       NPCOL )
            INLQ = NUMROC( N-ILO+IOFF+1, NB, MYCOL, ILCOL, NPCOL )
            LWMIN = NB*( NB + MAX( IHIP+1, IHLP+INLQ ) )
*
            WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )
            LQUERY = ( LWORK.EQ.-1 )
            IF( ILO.LT.1 .OR. ILO.GT.MAX( 1, N ) ) THEN
               INFO = -2
            ELSE IF( IHI.LT.MIN( ILO, N ) .OR. IHI.GT.N ) THEN
               INFO = -3
            ELSE IF( IROFFA.NE.ICOFFA .OR. IROFFA.NE.0 ) THEN
               INFO = -6
            ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
               INFO = -(700+NB_)
            ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
               INFO = -10
            END IF
         END IF
         IDUM1( 1 ) = ILO
         IDUM2( 1 ) = 2
         IDUM1( 2 ) = IHI
         IDUM2( 2 ) = 3
         IF( LWORK.EQ.-1 ) THEN
            IDUM1( 3 ) = -1
         ELSE
            IDUM1( 3 ) = 1
         END IF
         IDUM2( 3 ) = 10
         CALL PCHK1MAT( N, 1, N, 1, IA, JA, DESCA, 7, 3, IDUM1, IDUM2,
     $                  INFO )
      END IF
*
      IF( INFO.NE.0 ) THEN
         CALL PXERBLA( ICTXT, 'PZGEHRD', -INFO )
         RETURN
      ELSE IF( LQUERY ) THEN
         RETURN
      END IF
*
*     Set elements JA:JA+ILO-2 and JA+IHI-1:JA+N-2 of TAU to zero.
*
      NQ = NUMROC( JA+N-2, NB, MYCOL, DESCA( CSRC_ ), NPCOL )
      CALL INFOG1L( JA+ILO-2, NB, NPCOL, MYCOL, DESCA( CSRC_ ), JJ,
     $              IMCOL )
      DO 10 J = JJA, MIN( JJ, NQ )
         TAU( J ) = ZERO
   10 CONTINUE
*
      CALL INFOG1L( JA+IHI-1, NB, NPCOL, MYCOL, DESCA( CSRC_ ), JJ,
     $              IMCOL )
      DO 20 J = JJ, NQ
         TAU( J ) = ZERO
   20 CONTINUE
*
*     Quick return if possible
*
      IF( IHI-ILO.LE.0 )
     $   RETURN
*
      CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', COLCTOP )
      CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', ROWCTOP )
      CALL PB_TOPSET( ICTXT, 'Combine', 'Columnwise', '1-tree' )
      CALL PB_TOPSET( ICTXT, 'Combine', 'Rowwise', '1-tree' )
*
      IPT = 1
      IPY = IPT + NB * NB
      IPW = IPY + IHIP * NB
      CALL DESCSET( DESCY, IHI+IROFFA, NB, NB, NB, IAROW, ILCOL, ICTXT,
     $              MAX( 1, IHIP ) )
*
      K = ILO
      IB = NB - IOFF
      JY = IOFF + 1
*
*     Loop over remaining block of columns
*
      DO 30 L = 1, IHI-ILO+IOFF-NB, NB
         I = IA + K - 1
         J = JA + K - 1
*
*        Reduce columns j:j+ib-1 to Hessenberg form, returning the
*        matrices V and T of the block reflector H = I - V*T*V'
*        which performs the reduction, and also the matrix Y = A*V*T
*
         CALL PZLAHRD( IHI, K, IB, A, IA, J, DESCA, TAU, WORK( IPT ),
     $                 WORK( IPY ), 1, JY, DESCY, WORK( IPW ) )
*
*        Apply the block reflector H to A(ia:ia+ihi-1,j+ib:ja+ihi-1)
*        from the right, computing A := A - Y * V'.
*        V(i+ib,ib-1) must be set to 1.
*
         CALL PZELSET2( EI, A, I+IB, J+IB-1, DESCA, ONE )
         CALL PZGEMM( 'No transpose', 'Conjugate transpose', IHI,
     $                IHI-K-IB+1, IB, -ONE, WORK( IPY ), 1, JY, DESCY,
     $                A, I+IB, J, DESCA, ONE, A, IA, J+IB, DESCA )
         CALL PZELSET( A, I+IB, J+IB-1, DESCA, EI )
*
*        Apply the block reflector H to A(i+1:ia+ihi-1,j+ib:ja+n-1)
*        from the left
*
         CALL PZLARFB( 'Left', 'Conjugate transpose', 'Forward',
     $                 'Columnwise', IHI-K, N-K-IB+1, IB, A, I+1, J,
     $                 DESCA, WORK( IPT ), A, I+1, J+IB, DESCA,
     $                 WORK( IPY ) )
*
         K = K + IB
         IB = NB
         JY = 1
         DESCY( CSRC_ ) = MOD( DESCY( CSRC_ ) + 1, NPCOL )
*
   30 CONTINUE
*
*     Use unblocked code to reduce the rest of the matrix
*
      CALL PZGEHD2( N, K, IHI, A, IA, JA, DESCA, TAU, WORK, LWORK,
     $              IINFO )
*
      CALL PB_TOPSET( ICTXT, 'Combine', 'Columnwise', COLCTOP )
      CALL PB_TOPSET( ICTXT, 'Combine', 'Rowwise', ROWCTOP )
*
      WORK( 1 ) = DCMPLX( DBLE( LWMIN ) )
*
      RETURN
*
*     End of PZGEHRD
*
      END
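
The way the blocked loop (label 30) splits the columns ILO..IHI into
panels can be traced without any linear algebra. The Python sketch
below reproduces only the loop counters K and IB from the listing and
reports which panels go to PZLAHRD and where the unblocked PZGEHD2
call takes over. The function name panel_schedule is hypothetical; it
assumes the argument checks have already passed.

```python
def panel_schedule(ilo, ihi, ia, nb):
    """Return ([(k, ib), ...], k_rest) as produced by the DO 30 loop.

    Each (k, ib) is a panel of ib columns starting at column k of
    sub( A ); k_rest is the value of K passed to PZGEHD2 afterwards.
    """
    ioff = (ia + ilo - 2) % nb   # offset of column ILO inside its block
    k = ilo
    ib = nb - ioff               # first panel stops at a block boundary
    panels = []
    # Fortran: DO 30 L = 1, IHI-ILO+IOFF-NB, NB
    for _ in range(1, ihi - ilo + ioff - nb + 1, nb):
        panels.append((k, ib))
        k += ib
        ib = nb                  # later panels are full blocks
    return panels, k


print(panel_schedule(1, 10, 1, 4))  # ([(1, 4), (5, 4)], 9)
print(panel_schedule(3, 10, 1, 4))  # ([(3, 2), (5, 4)], 9)
print(panel_schedule(1, 4, 1, 4))   # ([], 1)
```

In the second call ILO = 3 falls in the middle of a block, so the first
panel is only two columns wide. In the third call the active range fits
in a single block and the whole reduction is left to the unblocked code.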