Routine: PCGESVD()  File: SRC\pcgesvd.f

 
 
# lines: 648
  # code: 648
  # comment: 0
  # blank:0
# Variables:93
# Callers:0
# Callings:9
# Words:351
# Keywords:213
 

 

.. Scalar Arguments ..
     ..
     .. Array Arguments ..
     ..
  Purpose
  =======
  PCGESVD computes the singular value decomposition (SVD) of an
  M-by-N matrix A, optionally computing the left and/or right
  singular vectors. The SVD is written as
       A = U * SIGMA * transpose(V)
  where SIGMA is an M-by-N matrix which is zero except for its
  min(M,N) diagonal elements, U is an M-by-M orthogonal matrix, and
  V is an N-by-N orthogonal matrix. The diagonal elements of SIGMA
  are the singular values of A and the columns of U and V are the
  corresponding right and left singular vectors, respectively. The
  singular values are returned in array S in decreasing order and
  only the first min(M,N) columns of U and rows of VT = V**T are
  computed.
  Notes
  =====
  Each global data object is described by an associated description
  vector. This vector stores the information required to establish
  the mapping between an object element and its corresponding process
  and memory location.
  Let A be a generic term for any 2D block cyclicly distributed array.
  Such a global array has an associated description vector DESCA.
  In the following comments, the character _ should be read as
  "of the global array".
  NOTATION        STORED IN      EXPLANATION
  --------------- -------------- --------------------------------------
  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                 DTYPE_A = 1.
  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                 the BLACS process grid A is distribu-
                                 ted over. The context itself is glo-
                                 bal, but the handle (the integer
                                 value) may vary.
  M_A    (global) DESCA( M_ )    The number of rows in the global
                                 array A.
  N_A    (global) DESCA( N_ )    The number of columns in the global
                                 array A.
  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                 the rows of the array.
  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                 the columns of the array.
  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                 row of the array A is distributed.
  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                 first column of the array A is
                                 distributed.
  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
  Let K be the number of rows or columns of a distributed matrix, and
  assume that its process grid has dimension r x c. LOCr( K ) denotes
  the number of elements of K that a process would receive if K were
  distributed over the r processes of its process column. Similarly,
  LOCc( K ) denotes the number of elements of K that a process would
  receive if K were distributed over the c processes of its process
  row. The values of LOCr() and LOCc() may be determined via a call
  to the ScaLAPACK tool function, NUMROC:
          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
  An upper bound for these quantities may be computed by:
          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
  Arguments
  =========
          MP = number of local rows in A and U
          NQ = number of local columns in A and VT
          SIZE = min( M, N )
          SIZEQ = number of local columns in U
          SIZEP = number of local rows in VT
  JOBU    (global input) CHARACTER*1
          Specifies options for computing U:
          = 'V':  the first SIZE columns of U (the left singular
                  vectors) are returned in the array U;
          = 'N':  no columns of U (no left singular vectors) are
                  computed.
  JOBVT   (global input) CHARACTER*1
          Specifies options for computing V**T:
          = 'V':  the first SIZE rows of V**T (the right singular
                  vectors) are returned in the array VT;
          = 'N':  no rows of V**T (no right singular vectors) are
                  computed.
  M       (global input) INTEGER
          The number of rows of the input matrix A.  M >= 0.
  N       (global input) INTEGER
          The number of columns of the input matrix A.  N >= 0.
  A       (local input/workspace) block cyclic COMPLEX
          array,
          global dimension (M, N), local dimension (MP, NQ)
          On exit, the contents of A are destroyed.
  IA      (global input) INTEGER
          The row index in the global array A indicating the first
          row of sub( A ).
  JA      (global input) INTEGER
          The column index in the global array A indicating the
          first column of sub( A ).
  DESCA   (global input) INTEGER array of dimension DLEN_
          The array descriptor for the distributed matrix A.
  S       (global output) REAL               array, dimension SIZE
          The singular values of A, sorted so that S(i) >= S(i+1).
  U       (local output) COMPLEX            array, local dimension
          (MP, SIZEQ), global dimension (M, SIZE)
          if JOBU = 'V', U contains the first min(m,n) columns of U
          if JOBU = 'N', U is not referenced.
  IU      (global input) INTEGER
          The row index in the global array U indicating the first
          row of sub( U ).
  JU      (global input) INTEGER
          The column index in the global array U indicating the
          first column of sub( U ).
  DESCU   (global input) INTEGER array of dimension DLEN_
          The array descriptor for the distributed matrix U.
  VT      (local output) COMPLEX            array, local dimension
          (SIZEP, NQ), global dimension (SIZE, N).
          If JOBVT = 'V', VT contains the first SIZE rows of
          V**T. If JOBVT = 'N', VT is not referenced.
  IVT     (global input) INTEGER
          The row index in the global array VT indicating the first
          row of sub( VT ).
  JVT     (global input) INTEGER
          The column index in the global array VT indicating the
          first column of sub( VT ).
  DESCVT   (global input) INTEGER array of dimension DLEN_
          The array descriptor for the distributed matrix VT.
  WORK    (local workspace/output) COMPLEX            array, dimension
          (LWORK)
          On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
  LWORK   (local input) INTEGER
          The dimension of the array WORK.
          LWORK >= 1 + 2*SIZEB + MAX(WATOBD, WBDTOSVD),
          where SIZEB = MAX(M,N), and WATOBD and WBDTOSVD refer,
          respectively, to the workspace required to bidiagonalize
          the matrix A and to go from the bidiagonal matrix to the
          singular value decomposition U*S*VT.
          For WATOBD, the following holds:
          WATOBD = MAX(MAX(WPCLANGE,WPCGEBRD),
                       MAX(WPCLARED2D,WP(pre)LARED1D)),
          where WPCLANGE, WPCLARED1D, WPCLARED2D, WPCGEBRD are the
          workspaces required respectively for the subprograms
          PCLANGE, PSLARED1D, PSLARED2D, PCGEBRD. Using the
          standard notation
          MP = NUMROC( M, MB, MYROW, DESCA( CTXT_ ), NPROW),
          NQ = NUMROC( N, NB, MYCOL, DESCA( LLD_ ), NPCOL),
          the workspaces required for the above subprograms are
          WPCLANGE = MP,
          WPSLARED1D = NQ0,
          WPSLARED2D = MP0,
          WPCGEBRD = NB*(MP + NQ + 1) + NQ,
          where NQ0 and MP0 refer, respectively, to the values obtained
          at MYCOL = 0 and MYROW = 0. In general, the upper limit for
          the workspace is given by a workspace required on
          processor (0,0):
          WATOBD <= NB*(MP0 + NQ0 + 1) + NQ0.
          In case of a homogeneous process grid this upper limit can
          be used as an estimate of the minimum workspace for every
          processor.
          For WBDTOSVD, the following holds:
          WBDTOSVD = SIZE*(WANTU*NRU + WANTVT*NCVT) +
                     MAX(WCBDSQR,
                         MAX(WANTU*WPCORMBRQLN, WANTVT*WPCORMBRPRT)),
          where
                          1, if left(right) singular vectors are wanted
          WANTU(WANTVT) =
                          0, otherwise
          and WCBDSQR, WPCORMBRQLN and WPCORMBRPRT refer respectively
          to the workspace required for the subprograms CBDSQR,
          PCUNMBR(QLN), and PCUNMBR(PRT), where QLN and PRT are the
          values of the arguments VECT, SIDE, and TRANS in the call
          to PCUNMBR. NRU is equal to the local number of rows of
          the matrix U when distributed 1-dimensional "column" of
          processes. Analogously, NCVT is equal to the local number
          of columns of the matrix VT when distributed across
          1-dimensional "row" of processes. Calling the LAPACK
          procedure CBDSQR requires
          WCBDSQR = MAX(1, 4*SIZE )
          on every processor. Finally,
          WPCORMBRQLN = MAX( (NB*(NB-1))/2, (SIZEQ+MP)*NB)+NB*NB,
          WPCORMBRPRT = MAX( (MB*(MB-1))/2, (SIZEP+NQ)*MB )+MB*MB,
          If LWORK = -1, then LWORK is global input and a workspace
          query is assumed; the routine only calculates the minimum
          size for the work array. The required workspace is returned
          as the first element of WORK and no error message is issued
          by PXERBLA.
  RWORK   (workspace) REAL             array, dimension (1+4*SIZEB)
          On exit, if INFO = 0, RWORK(1) returns the necessary size
          for RWORK.
  INFO    (output) INTEGER
          = 0:  successful exit
          < 0:  if INFO = -i, the i-th argument had an illegal value

 
Display dynamic version Find AutoScroll Reload FontSize: - + Hide Comments Hide Blanks Frame FullScreen MailPrint

 
001        SUBROUTINE PCGESVD(JOBU , JOBVT , M , N , A , IA , JA , DESCA , S , U , IU , JU , DESCU ,
002       +VT , IVT , JVT , DESCVT , WORK , LWORK , RWORK , INFO)
003  
004  *     -- ScaLAPACK routine(version 1.7) --
005  *     Univ. of Tennessee , Oak Ridge National Laboratory
006  *     and Univ. of California Berkeley.
007  *     Jan 2006
008  
009  *     > 0 : if CBDSQR did not converge
010  *     If INFO = MIN(M , N) + 1 , then PCGESVD has detected
011  *     heterogeneity by finding that eigenvalues were not
012  *     identical across the process grid. In this case , the
013  *     accuracy of the results from PCGESVD cannot be
014  *     guaranteed.
015  
016  *     === ==================================================================
017  
018  *     The results of PCGEBRD , and therefore PCGESVD , may vary slightly
019  *     from run to run with the same input data. If repeatability is an
020  *     issue , call BLACS_SET with the appropriate option after defining
021  *     the process grid.
022  
023  *     Alignment requirements
024  *     === ===================
025  
026  *     The routine PCGESVD inherits the same alignement requirement as
027  *     the routine PCGEBRD , namely :
028  
029  *     The distributed submatrix sub( A ) must verify some alignment proper -
030  *     ties , namely the following expressions should be true :
031  *     ( MB_A.EQ.NB_A .AND. IROFFA.EQ.ICOFFA )
032  *     where NB = MB_A = NB_A ,
033  *     IROFFA = MOD( IA - 1 , NB ) , ICOFFA = MOD( JA - 1 , NB ) ,
034  
035  *     === ==================================================================
036  
037  *     .. Parameters ..
038        INTEGER BLOCK_CYCLIC_2D , DLEN_ , DTYPE_ , CTXT_ , M_ , N_ , MB_ , NB_ , RSRC_ ,
039       +CSRC_ , LLD_ , ITHVAL
040        PARAMETER(BLOCK_CYCLIC_2D = 1 , DLEN_ = 9 , DTYPE_ = 1 , CTXT_ = 2 , M_ = 3 , N_ = 4 ,
041       +MB_ = 5 , NB_ = 6 , RSRC_ = 7 , CSRC_ = 8 , LLD_ = 9 , ITHVAL = 10)
042        COMPLEX ZERO , ONE
043        PARAMETER(ZERO =((0.0E + 0 , 0.0E + 0)) , ONE =((1.0E + 0 , 0.0E + 0)))
044        REAL DZERO , DONE
045        PARAMETER(DZERO = 0.0D + 0 , DONE = 1.0D + 0)
046  *     ..
047  *     .. Local Scalars ..
048        CHARACTER UPLO
049        INTEGER CONTEXTC , CONTEXTR , I , INDD , INDD2 , INDE , INDE2 , INDTAUP , INDTAUQ ,
050       +INDU , INDV , INDWORK , IOFFD , IOFFE , ISCALE , J , K , LDU , LDVT , LLWORK ,
051       +LWMIN , MAXIM , MB , MP , MYPCOL , MYPCOLC , MYPCOLR , MYPROW , MYPROWC ,
052       +MYPROWR , NB , NCVT , NPCOL , NPCOLC , NPCOLR , NPROCS , NPROW , NPROWC ,
053       +NPROWR , NQ , NRU , SIZE , SIZEB , SIZEP , SIZEPOS , SIZEQ , WANTU , WANTVT ,
054       +WATOBD , WBDTOSVD , WCBDSQR , WPCGEBRD , WPCLANGE , WPCORMBRPRT ,
055       +WPCORMBRQLN
056        REAL ANRM , BIGNUM , EPS , RMAX , RMIN , SAFMIN , SIGMA , SMLNUM
057  *     ..
058  *     .. Local Arrays ..
059        INTEGER DESCTU(DLEN_) , DESCTVT(DLEN_) , IDUM1(3) , IDUM2(3)
060        REAL C(1 , 1)
061  *     ..
062  *     .. External Functions ..
063        LOGICAL LSAME
064        INTEGER NUMROC
065        REAL PSLAMCH , PCLANGE
066        EXTERNAL LSAME , NUMROC , PDLAMCH , PZLANGE  
067  *     ..
068  *     .. External Subroutines ..
069        EXTERNAL BLACS_GET , BLACS_GRIDEXIT , BLACS_GRIDINFO , BLACS_GRIDINIT ,
070       +CHK1MAT , CBDSQR , DESCINIT , SGAMN2D , SGAMX2D , SSCAL , IGAMX2D ,
071       +IGEBR2D , IGEBS2D , PCHK1MAT , PCGEBRD , PCGEMR2D , PSLARED1D ,
072       +PSLARED2D , PCLASCL , PCLASET , PCUNMBR , PXERBLA
073  *     ..
074  *     .. Intrinsic Functions ..
075        INTRINSIC MAX , MIN , SQRT , REAL
076        INTRINSIC CMPLX
077  *     ..
078  *     .. Executable Statements ..
079  *     This is just to keep ftnchek happy
080        IF(BLOCK_CYCLIC_2D*DTYPE_*LLD_*MB_*M_*NB_*N_.LT.0) RETURN
081  
082        CALL BLACS_GRIDINFO(DESCA(CTXT_) , NPROW , NPCOL , MYPROW , MYPCOL)
083        ISCALE = 0
084        INFO = 0
085  
086        IF(NPROW.EQ. - 1) THEN
087            INFO = - (800 + CTXT_)
088        ELSE
089  
090            SIZE = MIN(M , N)
091            SIZEB = MAX(M , N)
092            NPROCS = NPROW*NPCOL
093            IF(M.GE.N) THEN
094                IOFFD = JA - 1
095                IOFFE = IA - 1
096                SIZEPOS = 1
097            ELSE
098                IOFFD = IA - 1
099                IOFFE = JA - 1
100                SIZEPOS = 3
101            END IF
102  
103            IF(LSAME(JOBU , 'V')) THEN
104                WANTU = 1
105            ELSE
106                WANTU = 0
107            END IF
108            IF(LSAME(JOBVT , 'V')) THEN
109                WANTVT = 1
110            ELSE
111                WANTVT = 0
112            END IF
113  
114            CALL CHK1MAT(M , 3 , N , 4 , IA , JA , DESCA , 8 , INFO)
115            IF(WANTU.EQ.1) THEN
116                CALL CHK1MAT(M , 3 , SIZE , SIZEPOS , IU , JU , DESCU , 13 , INFO)
117            END IF
118            IF(WANTVT.EQ.1) THEN
119                CALL CHK1MAT(SIZE , SIZEPOS , N , 4 , IVT , JVT , DESCVT , 17 , INFO)
120            END IF
121            CALL IGAMX2D(DESCA(CTXT_) , 'A' , ' ' , 1 , 1 , INFO , 1 , 1 , 1 ,- 1 ,- 1 , 0)
122  
123            IF(INFO.EQ.0) THEN
124  
125  *             Set up pointers into the WORK array.
126  
127                INDD = 2
128                INDE = INDD + SIZEB + IOFFD
129                INDD2 = INDE + SIZEB + IOFFE
130                INDE2 = INDD2 + SIZEB + IOFFD
131  
132                INDTAUQ = 2
133                INDTAUP = INDTAUQ + SIZEB + JA - 1
134                INDWORK = INDTAUP + SIZEB + IA - 1
135                LLWORK = LWORK - INDWORK + 1
136  
137  *             Initialize contexts for "column" and "row" process matrices.
138  
139                CALL BLACS_GET(DESCA(CTXT_) , 10 , CONTEXTC)
140                CALL BLACS_GRIDINIT(CONTEXTC , 'R' , NPROCS , 1)
141                CALL BLACS_GRIDINFO(CONTEXTC , NPROWC , NPCOLC , MYPROWC ,
142       +        MYPCOLC)
143                CALL BLACS_GET(DESCA(CTXT_) , 10 , CONTEXTR)
144                CALL BLACS_GRIDINIT(CONTEXTR , 'R' , 1 , NPROCS)
145                CALL BLACS_GRIDINFO(CONTEXTR , NPROWR , NPCOLR , MYPROWR ,
146       +        MYPCOLR)
147  
148  *             Set local dimensions of matrices(this is for MB = NB = 1).
149  
150                NRU = NUMROC(M , 1 , MYPROWC , 0 , NPROCS)
151                NCVT = NUMROC(N , 1 , MYPCOLR , 0 , NPROCS)
152                NB = DESCA(NB_)
153                MB = DESCA(MB_)
154                MP = NUMROC(M , MB , MYPROW , DESCA(RSRC_) , NPROW)
155                NQ = NUMROC(N , NB , MYPCOL , DESCA(CSRC_) , NPCOL)
156                IF(WANTVT.EQ.1) THEN
157                    SIZEP = NUMROC(SIZE , DESCVT(MB_) , MYPROW , DESCVT(RSRC_) ,
158       +            NPROW)
159                ELSE
160                    SIZEP = 0
161                END IF
162                IF(WANTU.EQ.1) THEN
163                    SIZEQ = NUMROC(SIZE , DESCU(NB_) , MYPCOL , DESCU(CSRC_) ,
164       +            NPCOL)
165                ELSE
166                    SIZEQ = 0
167                END IF
168  
169  *             Transmit MAX(NQ0 , MP0).
170  
171                IF(MYPROW.EQ.0 .AND. MYPCOL.EQ.0) THEN
172                    MAXIM = MAX(NQ , MP)
173                    CALL IGEBS2D(DESCA(CTXT_) , 'All' , ' ' , 1 , 1 , MAXIM , 1)
174                ELSE
175                    CALL IGEBR2D(DESCA(CTXT_) , 'All' , ' ' , 1 , 1 , MAXIM , 1 , 0 , 0)
176                END IF
177  
178                WPCLANGE = MP
179                WPCGEBRD = NB* (MP + NQ + 1) + NQ
180                WATOBD = MAX(MAX(WPCLANGE , WPCGEBRD) , MAXIM)
181  
182                WCBDSQR = MAX(1 , 4*SIZE)
183                WPCORMBRQLN = MAX((NB* (NB - 1)) / 2 ,(SIZEQ + MP)*NB) + NB*NB
184                WPCORMBRPRT = MAX((MB* (MB - 1)) / 2 ,(SIZEP + NQ)*MB) + MB*MB
185                WBDTOSVD = SIZE* (WANTU*NRU + WANTVT*NCVT) +
186       +        MAX(WCBDSQR , MAX(WANTU*WPCORMBRQLN ,
187       +        WANTVT*WPCORMBRPRT))
188  
189  *             Finally , calculate required workspace.
190  
191                LWMIN = 1 + 2*SIZEB + MAX(WATOBD , WBDTOSVD)
192                WORK(1) = CMPLX(LWMIN , 0D + 00)
193                RWORK(1) = REAL(1 + 4*SIZEB)
194  
195                IF(WANTU.NE.1 .AND. .NOT.(LSAME(JOBU , 'N'))) THEN
196                    INFO = - 1
197                ELSE IF(WANTVT.NE.1 .AND. .NOT.(LSAME(JOBVT , 'N'))) THEN
198                    INFO = - 2
199                ELSE IF(LWORK.LT.LWMIN .AND. LWORK.NE. - 1) THEN
200                    INFO = - 19
201                END IF
202  
203            END IF
204  
205            IDUM1(1) = WANTU
206            IDUM1(2) = WANTVT
207            IF(LWORK.EQ. - 1) THEN
208                IDUM1(3) = - 1
209            ELSE
210                IDUM1(3) = 1
211            END IF
212            IDUM2(1) = 1
213            IDUM2(2) = 2
214            IDUM2(3) = 19
215            CALL PCHK1MAT(M , 3 , N , 4 , IA , JA , DESCA , 8 , 3 , IDUM1 , IDUM2 , INFO)
216            IF(INFO.EQ.0) THEN
217                IF(WANTU.EQ.1) THEN
218                    CALL PCHK1MAT(M , 3 , SIZE , 4 , IU , JU , DESCU , 13 , 0 , IDUM1 , IDUM2 ,
219       +            INFO)
220                END IF
221                IF(WANTVT.EQ.1) THEN
222                    CALL PCHK1MAT(SIZE , 3 , N , 4 , IVT , JVT , DESCVT , 17 , 0 , IDUM1 ,
223       +            IDUM2 , INFO)
224                END IF
225            END IF
226  
227        END IF
228  
229        IF(INFO.NE.0) THEN
230            CALL PXERBLA(DESCA(CTXT_) , 'PCGESVD' ,- INFO)
231            RETURN
232        ELSE IF(LWORK.EQ. - 1) THEN
233            GO TO 40
234        END IF
235  
236  *     Quick return if possible.
237  
238        IF(M.LE.0 .OR. N.LE.0) GO TO 40
239  
240  *     Get machine constants.
241  
242        SAFMIN = PSLAMCH(DESCA(CTXT_) , 'Safe minimum')
243        EPS = PSLAMCH(DESCA(CTXT_) , 'Precision')
244        SMLNUM = SAFMIN / EPS
245        BIGNUM = DONE / SMLNUM
246        RMIN = SQRT(SMLNUM)
247        RMAX = MIN(SQRT(BIGNUM) , DONE / SQRT(SQRT(SAFMIN)))
248  
249  *     Scale matrix to allowable range , if necessary.
250  
251        ANRM = PCLANGE('1' , M , N , A , IA , JA , DESCA , WORK(INDWORK))
252        IF(ANRM.GT.DZERO .AND. ANRM.LT.RMIN) THEN
253            ISCALE = 1
254            SIGMA = RMIN / ANRM
255        ELSE IF(ANRM.GT.RMAX) THEN
256            ISCALE = 1
257            SIGMA = RMAX / ANRM
258        END IF
259  
260        IF(ISCALE.EQ.1) THEN
261            CALL PCLASCL ('G' , DONE , SIGMA , M , N , A , IA , JA , DESCA , INFO)
262        END IF
263  
264        CALL PCGEBRD (M , N , A , IA , JA , DESCA , RWORK(INDD) , RWORK(INDE) ,
265       +WORK(INDTAUQ) , WORK(INDTAUP) , WORK(INDWORK) , LLWORK ,
266       +INFO)
267  
268  *     Copy D and E to all processes.
269  *     Array D is in local array of dimension :
270  *     LOCc(JA + MIN(M , N) - 1) if M >= N ; LOCr(IA + MIN(M , N) - 1) otherwise.
271  *     Array E is in local array of dimension
272  *     LOCr(IA + MIN(M , N) - 1) if M >= N ; LOCc(JA + MIN(M , N) - 2) otherwise.
273  
274        IF(M.GE.N) THEN
275  *         Distribute D
276            CALL PSLARED1D (N + IOFFD , IA , JA , DESCA , RWORK(INDD) , RWORK(INDD2) ,
277       +    WORK(INDWORK) , LLWORK)
278  *         Distribute E
279            CALL PSLARED2D (M + IOFFE , IA , JA , DESCA , RWORK(INDE) , RWORK(INDE2) ,
280       +    WORK(INDWORK) , LLWORK)
281        ELSE
282  *         Distribute D
283            CALL PSLARED2D (M + IOFFD , IA , JA , DESCA , RWORK(INDD) , RWORK(INDD2) ,
284       +    WORK(INDWORK) , LLWORK)
285  *         Distribute E
286            CALL PSLARED1D (N + IOFFE , IA , JA , DESCA , RWORK(INDE) , RWORK(INDE2) ,
287       +    WORK(INDWORK) , LLWORK)
288        END IF
289  
290  *     Prepare for calling PCBDSQR.
291  
292        IF(M.GE.N) THEN
293            UPLO = 'U'
294        ELSE
295            UPLO = 'L'
296        END IF
297  
298        INDU = INDWORK
299        INDV = INDU + SIZE*NRU*WANTU
300        INDWORK = INDV + SIZE*NCVT*WANTVT
301  
302        LDU = MAX(1 , NRU)
303        LDVT = MAX(1 , SIZE)
304  
305        CALL DESCINIT(DESCTU , M , SIZE , 1 , 1 , 0 , 0 , CONTEXTC , LDU , INFO)
306        CALL DESCINIT(DESCTVT , SIZE , N , 1 , 1 , 0 , 0 , CONTEXTR , LDVT , INFO)
307  
308        IF(WANTU.EQ.1) THEN
309            CALL PCLASET ('Full' , M , SIZE , ZERO , ONE , WORK(INDU) , 1 , 1 , DESCTU)
310        ELSE
311            NRU = 0
312        END IF
313  
314        IF(WANTVT.EQ.1) THEN
315            CALL PCLASET ('Full' , SIZE , N , ZERO , ONE , WORK(INDV) , 1 , 1 , DESCTVT)
316        ELSE
317            NCVT = 0
318        END IF
319  
320        CALL CBDSQR(UPLO , SIZE , NCVT , NRU , 0 , RWORK(INDD2 + IOFFD) ,
321       +RWORK(INDE2 + IOFFE) , WORK(INDV) , SIZE , WORK(INDU) , LDU , C , 1 ,
322       +WORK(INDWORK) , INFO)
323  
324  *     Redistribute elements of U and VT in the block - cyclic fashion.
325  
326        IF(WANTU.EQ.1) CALL PCGEMR2D(M , SIZE , WORK(INDU) , 1 , 1 , DESCTU , U , IU ,
327       +JU , DESCU , DESCU(CTXT_))
328  
329        IF(WANTVT.EQ.1) CALL PCGEMR2D(SIZE , N , WORK(INDV) , 1 , 1 , DESCTVT , VT ,
330       +IVT , JVT , DESCVT , DESCVT(CTXT_))
331  
332  *     Set to ZERO "non-square" elements of the larger matrices U , VT.
333  
334        IF(M.GT.N .AND. WANTU.EQ.1) THEN
335            CALL PCLASET ('Full' , M - SIZE , SIZE , ZERO , ZERO , U , IA + SIZE , JU , DESCU)
336        ELSE IF(N.GT.M .AND. WANTVT.EQ.1) THEN
337            CALL PCLASET ('Full' , SIZE , N - SIZE , ZERO , ZERO , VT , IVT , JVT + SIZE ,
338       +    DESCVT)
339        END IF
340  
341  *     Multiply Householder rotations from bidiagonalized matrix.
342  
343        IF(WANTU.EQ.1) CALL PCUNMBR ('Q' , 'L' , 'N' , M , SIZE , N , A , IA , JA , DESCA ,
344       +WORK(INDTAUQ) , U , IU , JU , DESCU ,
345       +WORK(INDWORK) , LLWORK , INFO)
346  
347        IF(WANTVT.EQ.1) CALL PCUNMBR ('P' , 'R' , 'C' , SIZE , N , M , A , IA , JA , DESCA ,
348       +WORK(INDTAUP) , VT , IVT , JVT , DESCVT ,
349       +WORK(INDWORK) , LLWORK , INFO)
350  
351  *     Copy singular values into output array S.
352  
353        DO 10 I = 1 , SIZE
354            S(I) = RWORK(INDD2 + IOFFD + I - 1)
355     10 CONTINUE
356  
357  *     If matrix was scaled , then rescale singular values appropriately.
358  
359        IF(ISCALE.EQ.1) THEN
360            CALL SSCAL(SIZE , ONE / SIGMA , S , 1)
361        END IF
362  
363  *     Compare every ith eigenvalue , or all if there are only a few ,
364  *     across the process grid to check for heterogeneity.
365  
366        IF(SIZE.LE.ITHVAL) THEN
367            J = SIZE
368            K = 1
369        ELSE
370            J = SIZE / ITHVAL
371            K = ITHVAL
372        END IF
373  
374        DO 20 I = 1 , J
375            RWORK(I + INDE) = S((I - 1)*K + 1)
376            RWORK(I + INDD2) = S((I - 1)*K + 1)
377     20 CONTINUE
378  
379        CALL SGAMN2D(DESCA(CTXT_) , 'a' , ' ' , J , 1 , RWORK(1 + INDE) , J , 1 , 1 ,- 1 ,- 1 , 0)
380        CALL SGAMX2D(DESCA(CTXT_) , 'a' , ' ' , J , 1 , RWORK(1 + INDD2) , J , 1 , 1 ,- 1 ,- 1 ,
381       +0)
382  
383        DO 30 I = 1 , J
384            IF((RWORK(I + INDE) - RWORK(I + INDD2)).NE.DZERO) THEN
385                INFO = SIZE + 1
386            END IF
387     30 CONTINUE
388  
389     40 CONTINUE
390  
391        CALL BLACS_GRIDEXIT(CONTEXTC)
392        CALL BLACS_GRIDEXIT(CONTEXTR)
393  
394  *     End of PCGESVD
395  
396        RETURN
397        END