Routine: PSSYTD2()  File: SRC\pssytd2.f

 
 
# lines: 464
  # code: 464
  # comment: 0
  # blank:0
# Variables:48
# Callers:2
# Callings:0
# Words:160
# Keywords:107
 

 

..
     .. Array Arguments ..
     ..
  Purpose
  =======
  PSSYTD2 reduces a real symmetric matrix sub( A ) to symmetric
  tridiagonal form T by an orthogonal similarity transformation:
  Q' * sub( A ) * Q = T, where sub( A ) = A(IA:IA+N-1,JA:JA+N-1).
  Notes
  =====
  Each global data object is described by an associated description
  vector.  This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclicly distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is distribu-
                                 ted over. The context itself is glo-
                                 bal, but the handle (the integer
                                 value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
  Let K be the number of rows or columns of a distributed matrix,
  and assume that its process grid has dimension p x q.
  LOCr( K ) denotes the number of elements of K that a process
  would receive if K were distributed over the p processes of its
  process column.
  Similarly, LOCc( K ) denotes the number of elements of K that a
  process would receive if K were distributed over the q processes of
  its process row.
  The values of LOCr() and LOCc() may be determined via a call to the
  ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
  Arguments
  =========
  UPLO    (global input) CHARACTER
          Specifies whether the upper or lower triangular part of the
          symmetric matrix sub( A ) is stored:
          = 'U':  Upper triangular
          = 'L':  Lower triangular
  N       (global input) INTEGER
          The number of rows and columns to be operated on, i.e. the
          order of the distributed submatrix sub( A ). N >= 0.
  A       (local input/local output) REAL pointer into the
          local memory to an array of dimension (LLD_A,LOCc(JA+N-1)).
          On entry, this array contains the local pieces of the
          symmetric distributed matrix sub( A ).  If UPLO = 'U', the
          leading N-by-N upper triangular part of sub( A ) contains
          the upper triangular part of the matrix, and its strictly
          lower triangular part is not referenced. If UPLO = 'L', the
          leading N-by-N lower triangular part of sub( A ) contains the
          lower triangular part of the matrix, and its strictly upper
          triangular part is not referenced. On exit, if UPLO = 'U',
          the diagonal and first superdiagonal of sub( A ) are over-
          written by the corresponding elements of the tridiagonal
          matrix T, and the elements above the first superdiagonal,
          with the array TAU, represent the orthogonal matrix Q as a
          product of elementary reflectors; if UPLO = 'L', the diagonal
          and first subdiagonal of sub( A ) are overwritten by the
          corresponding elements of the tridiagonal matrix T, and the
          elements below the first subdiagonal, with the array TAU,
          represent the orthogonal matrix Q as a product of elementary
          reflectors. See Further Details.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global and local input) INTEGER array of dimension DLEN_.
          The array descriptor for the distributed matrix A.
  D       (local output) REAL array, dimension LOCc(JA+N-1)
          The diagonal elements of the tridiagonal matrix T:
          D(i) = A(i,i). D is tied to the distributed matrix A.
  E       (local output) REAL array, dimension LOCc(JA+N-1)
          if UPLO = 'U', LOCc(JA+N-2) otherwise. The off-diagonal
          elements of the tridiagonal matrix T: E(i) = A(i,i+1) if
          UPLO = 'U', E(i) = A(i+1,i) if UPLO = 'L'. E is tied to the
          distributed matrix A.
  TAU     (local output) REAL, array, dimension
          LOCc(JA+N-1). This array contains the scalar factors TAU of
          the elementary reflectors. TAU is tied to the distributed
          matrix A.
  WORK    (local workspace/local output) REAL array,
                                                    dimension (LWORK)
          On exit, WORK( 1 ) returns the minimal and optimal LWORK.
  LWORK   (local or global input) INTEGER
          The dimension of the array WORK.
          LWORK is local input and must be at least
          LWORK >= 3*N.
          If LWORK = -1, then LWORK is global input and a workspace
          query is assumed; the routine only calculates the minimum
          and optimal size for all work arrays. Each of these
          values is returned in the first entry of the corresponding
          work array, and no error message is issued by PXERBLA.
  INFO    (local output) INTEGER
          = 0:  successful exit
          < 0:  If the i-th argument is an array and the j-entry had
                an illegal value, then INFO = -(i*100+j), if the i-th
                argument is a scalar and had an illegal value, then
                INFO = -i.
  Further Details
  ===============
  If UPLO = 'U', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(n-1) . . . H(2) H(1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a real scalar, and v is a real vector with
  v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
  A(ia:ia+i-2,ja+i), and tau in TAU(ja+i-1).
  If UPLO = 'L', the matrix Q is represented as a product of elementary
  reflectors
     Q = H(1) H(2) . . . H(n-1).
  Each H(i) has the form
     H(i) = I - tau * v * v'
  where tau is a real scalar, and v is a real vector with
  v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
  A(ia+i+1:ia+n-1,ja+i-1), and tau in TAU(ja+i-1).
  The contents of sub( A ) on exit are illustrated by the following
  examples with n = 5:
  if UPLO = 'U':                       if UPLO = 'L':
    (  d   e   v2  v3  v4 )              (  d                  )
    (      d   e   v3  v4 )              (  e   d              )
    (          d   e   v4 )              (  v1  e   d          )
    (              d   e  )              (  v1  v2  e   d      )
    (                  d  )              (  v1  v2  v3  e   d  )
  where d and e denote diagonal and off-diagonal elements of T, and vi
  denotes an element of the vector defining H(i).
  Alignment requirements
  ======================
  The distributed submatrix sub( A ) must verify some alignment proper-
  ties, namely the following expression should be true:
  ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA ) with
  IROFFA = MOD( IA-1, MB_A ) and ICOFFA = MOD( JA-1, NB_A ).
  =====================================================================
     .. Parameters ..

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PSSYTD2( UPLO , N , A , IA , JA , DESCA , D , E , TAU , WORK ,
002       $LWORK , INFO )
003  
004  *     -- ScaLAPACK auxiliary routine(version 1.7) --
005  *     University of Tennessee , Knoxville , Oak Ridge National Laboratory ,
006  *     and University of California , Berkeley.
007  *     May 1 , 1997
008  
009  *     .. Scalar Arguments ..
010        CHARACTER UPLO
011        INTEGER IA , INFO , JA , LWORK , N
012        INTEGER BLOCK_CYCLIC_2D , CSRC_ , CTXT_ , DLEN_ , DTYPE_ ,
013       $LLD_ , MB_ , M_ , NB_ , N_ , RSRC_
014        PARAMETER( BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 ,
015       $CTXT_ = 2 , M_ = 3 , N_ = 4 , MB_ = 5 , NB_ = 6 ,
016       $RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 )
017        REAL HALF , ONE , ZERO
018        PARAMETER( HALF = 0.5E + 0 , ONE = 1.0E + 0 , ZERO = 0.0E + 0 )
019  *     ..
020  *     .. Local Scalars ..
021        LOGICAL LQUERY , UPPER
022        INTEGER IACOL , IAROW , ICOFFA , ICTXT , II , IK , IROFFA , J ,
023       $JJ , JK , JN , LDA , LWMIN , MYCOL , MYROW , NPCOL ,
024       $NPROW
025        REAL ALPHA , TAUI
026  *     ..
027  *     .. External Subroutines ..
028        EXTERNAL BLACS_ABORT , BLACS_GRIDINFO , CHK1MAT , INFOG2L ,
029       $PXERBLA , SAXPY , SGEBR2D , SGEBS2D ,
030       $SLARFG , SSYMV , SSYR2
031  *     ..
032  *     .. External Functions ..
033        LOGICAL LSAME
034        REAL SDOT
035        EXTERNAL LSAME , SDOT
036  *     ..
037  *     .. Intrinsic Functions ..
038        INTRINSIC REAL
039  *     ..
040  *     .. Executable Statements ..
041  
042  *     Get grid parameters
043  
044        ICTXT = DESCA( CTXT_ )
045        CALL BLACS_GRIDINFO( ICTXT , NPROW , NPCOL , MYROW , MYCOL )
046  
047  *     Test the input parameters
048  
049        INFO = 0
050        IF( NPROW.EQ. - 1 ) THEN
051            INFO = - (600 + CTXT_)
052        ELSE
053            UPPER = LSAME( UPLO , 'U' )
054            CALL CHK1MAT( N , 2 , N , 2 , IA , JA , DESCA , 6 , INFO )
055            LWMIN = 3 * N
056  
057            WORK( 1 ) = REAL( LWMIN )
058            LQUERY =( LWORK.EQ. - 1 )
059            IF( INFO.EQ.0 ) THEN
060                IROFFA = MOD( IA - 1 , DESCA( MB_ ) )
061                ICOFFA = MOD( JA - 1 , DESCA( NB_ ) )
062                IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO , 'L' ) ) THEN
063                    INFO = - 1
064                ELSE IF( IROFFA.NE.ICOFFA ) THEN
065                    INFO = - 5
066                ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN
067                    INFO = - (600 + NB_)
068                ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN
069                    INFO = - 11
070                END IF
071            END IF
072        END IF
073  
074        IF( INFO.NE.0 ) THEN
075            CALL PXERBLA( ICTXT , 'PSSYTD2' , - INFO )
076            CALL BLACS_ABORT( ICTXT , 1 )
077            RETURN
078        ELSE IF( LQUERY ) THEN
079            RETURN
080        END IF
081  
082  *     Quick return if possible
083  
084        IF( N.LE.0 )
085       $    RETURN
086  
087  *         Compute local information
088  
089            LDA = DESCA( LLD_ )
090            CALL INFOG2L( IA , JA , DESCA , NPROW , NPCOL , MYROW , MYCOL , II , JJ ,
091       $    IAROW , IACOL )
092  
093            IF( UPPER ) THEN
094  
095  *             Process(IAROW , IACOL) owns block to be reduced
096  
097                IF( MYCOL.EQ.IACOL ) THEN
098                    IF( MYROW.EQ.IAROW ) THEN
099  
100  *                     Reduce the upper triangle of sub( A )
101  
102                        DO 10 J = N - 1 , 1 , - 1
103                            IK = II + J - 1
104                            JK = JJ + J - 1
105  
106  *                         Generate elementary reflector H(i) = I - tau * v * v'
107  *                         to annihilate A(IA : IA + J - 1 , JA : JA + J - 1)
108  
109                            CALL SLARFG( J , A( IK + JK*LDA ) , A( II + JK*LDA ) , 1 ,
110       $                    TAUI )
111                            E( JK + 1 ) = A( IK + JK*LDA )
112  
113                            IF( TAUI.NE.ZERO ) THEN
114  
115  *                             Apply H(i) from both sides to
116  *                             A(IA : IA + J - 1 , JA : JA + J - 1)
117  
118                                A( IK + JK*LDA ) = ONE
119  
120  *                             Compute x := tau * A * v storing x in TAU(1 : i)
121  
122                                CALL SSYMV( UPLO , J , TAUI , A( II + (JJ - 1)*LDA ) ,
123       $                        LDA , A( II + JK*LDA ) , 1 , ZERO ,
124       $                        TAU( JJ ) , 1 )
125  
126  *                             Compute w := x - 1 / 2 * tau * (x'*v) * v
127  
128                                ALPHA = - HALF*TAUI*SDOT( J , TAU( JJ ) , 1 ,
129       $                        A( II + JK*LDA ) , 1 )
130                                CALL SAXPY( J , ALPHA , A( II + JK*LDA ) , 1 ,
131       $                        TAU( JJ ) , 1 )
132  
133  *                             Apply the transformation as a rank - 2 update :
134  *                             A := A - v * w' - w * v'
135  
136                                CALL SSYR2( UPLO , J , - ONE , A( II + JK*LDA ) , 1 ,
137       $                        TAU( JJ ) , 1 , A( II + (JJ - 1)*LDA ) ,
138       $                        LDA )
139                                A( IK + JK*LDA ) = E( JK + 1 )
140                            END IF
141  
142  *                         Copy D , E , TAU to broadcast them columnwise.
143  
144                            D( JK + 1 ) = A( IK + 1 + JK*LDA )
145                            WORK( J + 1 ) = D( JK + 1 )
146                            WORK( N + J + 1 ) = E( JK + 1 )
147                            TAU( JK + 1 ) = TAUI
148                            WORK( 2*N + J + 1 ) = TAU( JK + 1 )
149  
150     10                 CONTINUE
151                        D( JJ ) = A( II + (JJ - 1)*LDA )
152                        WORK( 1 ) = D( JJ )
153                        WORK( N + 1 ) = ZERO
154                        WORK( 2*N + 1 ) = ZERO
155  
156                        CALL SGEBS2D( ICTXT , 'Columnwise' , ' ' , 1 , 3*N , WORK , 1 )
157  
158                    ELSE
159                        CALL SGEBR2D( ICTXT , 'Columnwise' , ' ' , 1 , 3*N , WORK , 1 ,
160       $                IAROW , IACOL )
161                        DO 20 J = 2 , N
162                            JN = JJ + J - 1
163                            D( JN ) = WORK( J )
164                            E( JN ) = WORK( N + J )
165                            TAU( JN ) = WORK( 2*N + J )
166     20                 CONTINUE
167                        D( JJ ) = WORK( 1 )
168                    END IF
169                END IF
170  
171            ELSE
172  
173  *             Process(IAROW , IACOL) owns block to be factorized
174  
175                IF( MYCOL.EQ.IACOL ) THEN
176                    IF( MYROW.EQ.IAROW ) THEN
177  
178  *                     Reduce the lower triangle of sub( A )
179  
180                        DO 30 J = 1 , N - 1
181                            IK = II + J - 1
182                            JK = JJ + J - 1
183  
184  *                         Generate elementary reflector H(i) = I - tau * v * v'
185  *                         to annihilate A(IA + J - JA + 2 : IA + N - 1 , JA + J - 1)
186  
187                            CALL SLARFG( N - J , A( IK + 1 + (JK - 1)*LDA ) ,
188       $                    A( IK + 2 + (JK - 1)*LDA ) , 1 , TAUI )
189                            E( JK ) = A( IK + 1 + (JK - 1)*LDA )
190  
191                            IF( TAUI.NE.ZERO ) THEN
192  
193  *                             Apply H(i) from both sides to
194  *                             A(IA + J - JA + 1 : IA + N - 1 , JA + J + 1 : JA + N - 1)
195  
196                                A( IK + 1 + (JK - 1)*LDA ) = ONE
197  
198  *                             Compute x := tau * A * v storing y in TAU(i : n - 1)
199  
200                                CALL SSYMV( UPLO , N - J , TAUI , A( IK + 1 + JK*LDA ) ,
201       $                        LDA , A( IK + 1 + (JK - 1)*LDA ) , 1 ,
202       $                        ZERO , TAU( JK ) , 1 )
203  
204  *                             Compute w := x - 1 / 2 * tau * (x'*v) * v
205  
206                                ALPHA = - HALF*TAUI*SDOT( N - J , TAU( JK ) , 1 ,
207       $                        A( IK + 1 + (JK - 1)*LDA ) , 1 )
208                                CALL SAXPY( N - J , ALPHA , A( IK + 1 + (JK - 1)*LDA ) ,
209       $                        1 , TAU( JK ) , 1 )
210  
211  *                             Apply the transformation as a rank - 2 update :
212  *                             A := A - v * w' - w * v'
213  
214                                CALL SSYR2( UPLO , N - J , - ONE ,
215       $                        A( IK + 1 + (JK - 1)*LDA ) , 1 ,
216       $                        TAU( JK ) , 1 , A( IK + 1 + JK*LDA ) ,
217       $                        LDA )
218                                A( IK + 1 + (JK - 1)*LDA ) = E( JK )
219                            END IF
220  
221  *                         Copy D(JK) , E(JK) , TAU(JK) to broadcast them
222  *                         columnwise.
223  
224                            D( JK ) = A( IK + (JK - 1)*LDA )
225                            WORK( J ) = D( JK )
226                            WORK( N + J ) = E( JK )
227                            TAU( JK ) = TAUI
228                            WORK( 2*N + J ) = TAU( JK )
229     30                 CONTINUE
230                        JN = JJ + N - 1
231                        D( JN ) = A( II + N - 1 + (JN - 1)*LDA )
232                        WORK( N ) = D( JN )
233                        TAU( JN ) = ZERO
234                        WORK( 2*N ) = ZERO
235  
236                        CALL SGEBS2D( ICTXT , 'Columnwise' , ' ' , 1 , 3*N - 1 , WORK ,
237       $                1 )
238  
239                    ELSE
240                        CALL SGEBR2D( ICTXT , 'Columnwise' , ' ' , 1 , 3*N - 1 , WORK ,
241       $                1 , IAROW , IACOL )
242                        DO 40 J = 1 , N - 1
243                            JN = JJ + J - 1
244                            D( JN ) = WORK( J )
245                            E( JN ) = WORK( N + J )
246                            TAU( JN ) = WORK( 2*N + J )
247     40                 CONTINUE
248                        JN = JJ + N - 1
249                        D( JN ) = WORK( N )
250                        TAU( JN ) = ZERO
251                    END IF
252                END IF
253            END IF
254  
255            WORK( 1 ) = REAL( LWMIN )
256  
257            RETURN
258  
259  *         End of PSSYTD2
260  
261        END