LAPACK  3.7.0
LAPACK: Linear Algebra PACKage
subroutine cgesvj ( character*1  JOBA,
character*1  JOBU,
character*1  JOBV,
integer  M,
integer  N,
complex, dimension( lda, * )  A,
integer  LDA,
real, dimension( n )  SVA,
integer  MV,
complex, dimension( ldv, * )  V,
integer  LDV,
complex, dimension( lwork )  CWORK,
integer  LWORK,
real, dimension( lrwork )  RWORK,
integer  LRWORK,
integer  INFO 
)

CGESVJ

Download CGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 CGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER* 1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U' or 'F': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' or 'J': the matrix V is computed and returned in the array V
          = 'A' : the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly; instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N' : the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C':
                 If INFO .EQ. 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold SLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'),
                 see the description of JOBU.
                 If INFO .GT. 0,
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU .EQ. 'N':
                 If INFO .EQ. 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO .GT. 0 :
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is REAL array, dimension (N)
          On exit,
          If INFO .EQ. 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE .EQ. ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO .GT. 0 :
          the procedure CGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV .EQ. 'A', then the product of Jacobi rotations in CGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV .GE. 1.
          If JOBV .EQ. 'V', then LDV .GE. max(1,N).
          If JOBV .EQ. 'A', then LDV .GE. max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX array, dimension max(1,LWORK).
          Used as workspace.
          If on entry LWORK .EQ. -1, then a workspace query is assumed and
          no computation is done; CWORK(1) is set to the minial (and optimal)
          length of CWORK.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is REAL array, dimension max(6,LRWORK).
          On entry,
          If JOBU .EQ. 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when CGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is stil useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
         If on entry LRWORK .EQ. -1, then a workspace query is assumed and
         no computation is done; RWORK(1) is set to the minial (and optimal)
         length of RWORK.
[in]LRWORK
         LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0 : successful exit.
          < 0 : if INFO = -i, then the i-th argument had an illegal value
          > 0 : CGESVJ did not converge in the maximal allowed number
                (NSWEEP=30) of sweeps. The output may still be useful.
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Date
June 2016
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tunning parameters (marked with [TP]) are available for the
 implementer.
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributor:
  ============

  Zlatko Drmac (Zagreb, Croatia)
References:
[1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer. SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic. SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. LAPACK Working note 169. [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. LAPACK Working note 170. [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations. Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  [email protected]. Thank you.

Definition at line 351 of file cgesvj.f.

351 *
352 * -- LAPACK computational routine (version 3.7.0) --
353 * -- LAPACK is a software package provided by Univ. of Tennessee, --
354 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355 * June 2016
356 *
357  IMPLICIT NONE
358 * .. Scalar Arguments ..
359  INTEGER info, lda, ldv, lwork, lrwork, m, mv, n
360  CHARACTER*1 joba, jobu, jobv
361 * ..
362 * .. Array Arguments ..
363  COMPLEX a( lda, * ), v( ldv, * ), cwork( lwork )
364  REAL rwork( lrwork ), sva( n )
365 * ..
366 *
367 * =====================================================================
368 *
369 * .. Local Parameters ..
370  REAL zero, half, one
371  parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
372  COMPLEX czero, cone
373  parameter( czero = (0.0e0, 0.0e0), cone = (1.0e0, 0.0e0) )
374  INTEGER nsweep
375  parameter( nsweep = 30 )
376 * ..
377 * .. Local Scalars ..
378  COMPLEX aapq, ompq
379  REAL aapp, aapp0, aapq1, aaqq, apoaq, aqoap, big,
380  $ bigtheta, cs, ctol, epsln, mxaapq,
381  $ mxsinj, rootbig, rooteps, rootsfmin, roottol,
382  $ skl, sfmin, small, sn, t, temp1, theta, thsign, tol
383  INTEGER blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
384  $ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
385  $ n4, nbl, notrot, p, pskipped, q, rowskip, swband
386  LOGICAL applv, goscale, lower, lquery, lsvec, noscale, rotok,
387  $ rsvec, uctol, upper
388 * ..
389 * ..
390 * .. Intrinsic Functions ..
391  INTRINSIC abs, max, min, conjg, REAL, sign, sqrt
392 * ..
393 * .. External Functions ..
394 * ..
395 * from BLAS
396  REAL scnrm2
397  COMPLEX cdotc
398  EXTERNAL cdotc, scnrm2
399  INTEGER isamax
400  EXTERNAL isamax
401 * from LAPACK
402  REAL slamch
403  EXTERNAL slamch
404  LOGICAL lsame
405  EXTERNAL lsame
406 * ..
407 * .. External Subroutines ..
408 * ..
409 * from BLAS
410  EXTERNAL ccopy, crot, csscal, cswap
411 * from LAPACK
412  EXTERNAL clascl, claset, classq, slascl, xerbla
413  EXTERNAL cgsvj0, cgsvj1
414 * ..
415 * .. Executable Statements ..
416 *
417 * Test the input arguments
418 *
419  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
420  uctol = lsame( jobu, 'C' )
421  rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
422  applv = lsame( jobv, 'A' )
423  upper = lsame( joba, 'U' )
424  lower = lsame( joba, 'L' )
425 *
426  lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
427  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
428  info = -1
429  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
430  info = -2
431  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
432  info = -3
433  ELSE IF( m.LT.0 ) THEN
434  info = -4
435  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
436  info = -5
437  ELSE IF( lda.LT.m ) THEN
438  info = -7
439  ELSE IF( mv.LT.0 ) THEN
440  info = -9
441  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
442  $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
443  info = -11
444  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
445  info = -12
446  ELSE IF( lwork.LT.( m+n ) .AND. ( .NOT.lquery ) ) THEN
447  info = -13
448  ELSE IF( lrwork.LT.max( n, 6 ) .AND. ( .NOT.lquery ) ) THEN
449  info = -15
450  ELSE
451  info = 0
452  END IF
453 *
454 * #:(
455  IF( info.NE.0 ) THEN
456  CALL xerbla( 'CGESVJ', -info )
457  RETURN
458  ELSE IF ( lquery ) THEN
459  cwork(1) = m + n
460  rwork(1) = max( n, 6 )
461  RETURN
462  END IF
463 *
464 * #:) Quick return for void matrix
465 *
466  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
467 *
468 * Set numerical parameters
469 * The stopping criterion for Jacobi rotations is
470 *
471 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
472 *
473 * where EPS is the round-off and CTOL is defined as follows:
474 *
475  IF( uctol ) THEN
476 * ... user controlled
477  ctol = rwork( 1 )
478  ELSE
479 * ... default
480  IF( lsvec .OR. rsvec .OR. applv ) THEN
481  ctol = sqrt( REAL( M ) )
482  ELSE
483  ctol = REAL( m )
484  END IF
485  END IF
486 * ... and the machine dependent parameters are
487 *[!] (Make sure that SLAMCH() works properly on the target machine.)
488 *
489  epsln = slamch( 'Epsilon' )
490  rooteps = sqrt( epsln )
491  sfmin = slamch( 'SafeMinimum' )
492  rootsfmin = sqrt( sfmin )
493  small = sfmin / epsln
494 * BIG = SLAMCH( 'Overflow' )
495  big = one / sfmin
496  rootbig = one / rootsfmin
497 * LARGE = BIG / SQRT( REAL( M*N ) )
498  bigtheta = one / rooteps
499 *
500  tol = ctol*epsln
501  roottol = sqrt( tol )
502 *
503  IF( REAL( m )*epsln.GE.one ) then
504  info = -4
505  CALL xerbla( 'CGESVJ', -info )
506  RETURN
507  END IF
508 *
509 * Initialize the right singular vector matrix.
510 *
511  IF( rsvec ) THEN
512  mvl = n
513  CALL claset( 'A', mvl, n, czero, cone, v, ldv )
514  ELSE IF( applv ) THEN
515  mvl = mv
516  END IF
517  rsvec = rsvec .OR. applv
518 *
519 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
520 *(!) If necessary, scale A to protect the largest singular value
521 * from overflow. It is possible that saving the largest singular
522 * value destroys the information about the small ones.
523 * This initial scaling is almost minimal in the sense that the
524 * goal is to make sure that no column norm overflows, and that
525 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
526 * in A are detected, the procedure returns with INFO=-6.
527 *
528  skl = one / sqrt( REAL( m )*REAL( N ) )
529  noscale = .true.
530  goscale = .true.
531 *
532  IF( lower ) THEN
533 * the input matrix is M-by-N lower triangular (trapezoidal)
534  DO 1874 p = 1, n
535  aapp = zero
536  aaqq = one
537  CALL classq( m-p+1, a( p, p ), 1, aapp, aaqq )
538  IF( aapp.GT.big ) THEN
539  info = -6
540  CALL xerbla( 'CGESVJ', -info )
541  RETURN
542  END IF
543  aaqq = sqrt( aaqq )
544  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
545  sva( p ) = aapp*aaqq
546  ELSE
547  noscale = .false.
548  sva( p ) = aapp*( aaqq*skl )
549  IF( goscale ) THEN
550  goscale = .false.
551  DO 1873 q = 1, p - 1
552  sva( q ) = sva( q )*skl
553  1873 CONTINUE
554  END IF
555  END IF
556  1874 CONTINUE
557  ELSE IF( upper ) THEN
558 * the input matrix is M-by-N upper triangular (trapezoidal)
559  DO 2874 p = 1, n
560  aapp = zero
561  aaqq = one
562  CALL classq( p, a( 1, p ), 1, aapp, aaqq )
563  IF( aapp.GT.big ) THEN
564  info = -6
565  CALL xerbla( 'CGESVJ', -info )
566  RETURN
567  END IF
568  aaqq = sqrt( aaqq )
569  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
570  sva( p ) = aapp*aaqq
571  ELSE
572  noscale = .false.
573  sva( p ) = aapp*( aaqq*skl )
574  IF( goscale ) THEN
575  goscale = .false.
576  DO 2873 q = 1, p - 1
577  sva( q ) = sva( q )*skl
578  2873 CONTINUE
579  END IF
580  END IF
581  2874 CONTINUE
582  ELSE
583 * the input matrix is M-by-N general dense
584  DO 3874 p = 1, n
585  aapp = zero
586  aaqq = one
587  CALL classq( m, a( 1, p ), 1, aapp, aaqq )
588  IF( aapp.GT.big ) THEN
589  info = -6
590  CALL xerbla( 'CGESVJ', -info )
591  RETURN
592  END IF
593  aaqq = sqrt( aaqq )
594  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
595  sva( p ) = aapp*aaqq
596  ELSE
597  noscale = .false.
598  sva( p ) = aapp*( aaqq*skl )
599  IF( goscale ) THEN
600  goscale = .false.
601  DO 3873 q = 1, p - 1
602  sva( q ) = sva( q )*skl
603  3873 CONTINUE
604  END IF
605  END IF
606  3874 CONTINUE
607  END IF
608 *
609  IF( noscale )skl = one
610 *
611 * Move the smaller part of the spectrum from the underflow threshold
612 *(!) Start by determining the position of the nonzero entries of the
613 * array SVA() relative to ( SFMIN, BIG ).
614 *
615  aapp = zero
616  aaqq = big
617  DO 4781 p = 1, n
618  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
619  aapp = max( aapp, sva( p ) )
620  4781 CONTINUE
621 *
622 * #:) Quick return for zero matrix
623 *
624  IF( aapp.EQ.zero ) THEN
625  IF( lsvec )CALL claset( 'G', m, n, czero, cone, a, lda )
626  rwork( 1 ) = one
627  rwork( 2 ) = zero
628  rwork( 3 ) = zero
629  rwork( 4 ) = zero
630  rwork( 5 ) = zero
631  rwork( 6 ) = zero
632  RETURN
633  END IF
634 *
635 * #:) Quick return for one-column matrix
636 *
637  IF( n.EQ.1 ) THEN
638  IF( lsvec )CALL clascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
639  $ a( 1, 1 ), lda, ierr )
640  rwork( 1 ) = one / skl
641  IF( sva( 1 ).GE.sfmin ) THEN
642  rwork( 2 ) = one
643  ELSE
644  rwork( 2 ) = zero
645  END IF
646  rwork( 3 ) = zero
647  rwork( 4 ) = zero
648  rwork( 5 ) = zero
649  rwork( 6 ) = zero
650  RETURN
651  END IF
652 *
653 * Protect small singular values from underflow, and try to
654 * avoid underflows/overflows in computing Jacobi rotations.
655 *
656  sn = sqrt( sfmin / epsln )
657  temp1 = sqrt( big / REAL( N ) )
658  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
659  $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
660  temp1 = min( big, temp1 / aapp )
661 * AAQQ = AAQQ*TEMP1
662 * AAPP = AAPP*TEMP1
663  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
664  temp1 = min( sn / aaqq, big / ( aapp*sqrt( REAL( N ) ) ) )
665 * AAQQ = AAQQ*TEMP1
666 * AAPP = AAPP*TEMP1
667  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
668  temp1 = max( sn / aaqq, temp1 / aapp )
669 * AAQQ = AAQQ*TEMP1
670 * AAPP = AAPP*TEMP1
671  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
672  temp1 = min( sn / aaqq, big / ( sqrt( REAL( N ) )*aapp ) )
673 * AAQQ = AAQQ*TEMP1
674 * AAPP = AAPP*TEMP1
675  ELSE
676  temp1 = one
677  END IF
678 *
679 * Scale, if necessary
680 *
681  IF( temp1.NE.one ) THEN
682  CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
683  END IF
684  skl = temp1*skl
685  IF( skl.NE.one ) THEN
686  CALL clascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
687  skl = one / skl
688  END IF
689 *
690 * Row-cyclic Jacobi SVD algorithm with column pivoting
691 *
692  emptsw = ( n*( n-1 ) ) / 2
693  notrot = 0
694 
695  DO 1868 q = 1, n
696  cwork( q ) = cone
697  1868 CONTINUE
698 *
699 *
700 *
701  swband = 3
702 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
703 * if CGESVJ is used as a computational routine in the preconditioned
704 * Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure
705 * works on pivots inside a band-like region around the diagonal.
706 * The boundaries are determined dynamically, based on the number of
707 * pivots above a threshold.
708 *
709  kbl = min( 8, n )
710 *[TP] KBL is a tuning parameter that defines the tile size in the
711 * tiling of the p-q loops of pivot pairs. In general, an optimal
712 * value of KBL depends on the matrix dimensions and on the
713 * parameters of the computer's memory.
714 *
715  nbl = n / kbl
716  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
717 *
718  blskip = kbl**2
719 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
720 *
721  rowskip = min( 5, kbl )
722 *[TP] ROWSKIP is a tuning parameter.
723 *
724  lkahead = 1
725 *[TP] LKAHEAD is a tuning parameter.
726 *
727 * Quasi block transformations, using the lower (upper) triangular
728 * structure of the input matrix. The quasi-block-cycling usually
729 * invokes cubic convergence. Big part of this cycle is done inside
730 * canonical subspaces of dimensions less than M.
731 *
732  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
733 *[TP] The number of partition levels and the actual partition are
734 * tuning parameters.
735  n4 = n / 4
736  n2 = n / 2
737  n34 = 3*n4
738  IF( applv ) THEN
739  q = 0
740  ELSE
741  q = 1
742  END IF
743 *
744  IF( lower ) THEN
745 *
746 * This works very well on lower triangular matrices, in particular
747 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
748 * The idea is simple:
749 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
750 * [+ + 0 0] [0 0]
751 * [+ + x 0] actually work on [x 0] [x 0]
752 * [+ + x x] [x x]. [x x]
753 *
754  CALL cgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
755  $ cwork( n34+1 ), sva( n34+1 ), mvl,
756  $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
757  $ 2, cwork( n+1 ), lwork-n, ierr )
758 
759  CALL cgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
760  $ cwork( n2+1 ), sva( n2+1 ), mvl,
761  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
762  $ cwork( n+1 ), lwork-n, ierr )
763 
764  CALL cgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
765  $ cwork( n2+1 ), sva( n2+1 ), mvl,
766  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
767  $ cwork( n+1 ), lwork-n, ierr )
768 *
769  CALL cgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
770  $ cwork( n4+1 ), sva( n4+1 ), mvl,
771  $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
772  $ cwork( n+1 ), lwork-n, ierr )
773 *
774  CALL cgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
775  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
776  $ ierr )
777 *
778  CALL cgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
779  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
780  $ lwork-n, ierr )
781 *
782 *
783  ELSE IF( upper ) THEN
784 *
785 *
786  CALL cgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
787  $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
788  $ ierr )
789 *
790  CALL cgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
791  $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
792  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
793  $ ierr )
794 *
795  CALL cgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
796  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
797  $ lwork-n, ierr )
798 *
799  CALL cgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
800  $ cwork( n2+1 ), sva( n2+1 ), mvl,
801  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
802  $ cwork( n+1 ), lwork-n, ierr )
803 
804  END IF
805 *
806  END IF
807 *
808 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
809 *
810  DO 1993 i = 1, nsweep
811 *
812 * .. go go go ...
813 *
814  mxaapq = zero
815  mxsinj = zero
816  iswrot = 0
817 *
818  notrot = 0
819  pskipped = 0
820 *
821 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
822 * 1 <= p < q <= N. This is the first step toward a blocked implementation
823 * of the rotations. New implementation, based on block transformations,
824 * is under development.
825 *
826  DO 2000 ibr = 1, nbl
827 *
828  igl = ( ibr-1 )*kbl + 1
829 *
830  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
831 *
832  igl = igl + ir1*kbl
833 *
834  DO 2001 p = igl, min( igl+kbl-1, n-1 )
835 *
836 * .. de Rijk's pivoting
837 *
838  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
839  IF( p.NE.q ) THEN
840  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
841  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1,
842  $ v( 1, q ), 1 )
843  temp1 = sva( p )
844  sva( p ) = sva( q )
845  sva( q ) = temp1
846  aapq = cwork(p)
847  cwork(p) = cwork(q)
848  cwork(q) = aapq
849  END IF
850 *
851  IF( ir1.EQ.0 ) THEN
852 *
853 * Column norms are periodically updated by explicit
854 * norm computation.
855 *[!] Caveat:
856 * Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1)
857 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
858 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
859 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
860 * Hence, SCNRM2 cannot be trusted, not even in the case when
861 * the true norm is far from the under(over)flow boundaries.
862 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
863 * below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )".
864 *
865  IF( ( sva( p ).LT.rootbig ) .AND.
866  $ ( sva( p ).GT.rootsfmin ) ) THEN
867  sva( p ) = scnrm2( m, a( 1, p ), 1 )
868  ELSE
869  temp1 = zero
870  aapp = one
871  CALL classq( m, a( 1, p ), 1, temp1, aapp )
872  sva( p ) = temp1*sqrt( aapp )
873  END IF
874  aapp = sva( p )
875  ELSE
876  aapp = sva( p )
877  END IF
878 *
879  IF( aapp.GT.zero ) THEN
880 *
881  pskipped = 0
882 *
883  DO 2002 q = p + 1, min( igl+kbl-1, n )
884 *
885  aaqq = sva( q )
886 *
887  IF( aaqq.GT.zero ) THEN
888 *
889  aapp0 = aapp
890  IF( aaqq.GE.one ) THEN
891  rotok = ( small*aapp ).LE.aaqq
892  IF( aapp.LT.( big / aaqq ) ) THEN
893  aapq = ( cdotc( m, a( 1, p ), 1,
894  $ a( 1, q ), 1 ) / aaqq ) / aapp
895  ELSE
896  CALL ccopy( m, a( 1, p ), 1,
897  $ cwork(n+1), 1 )
898  CALL clascl( 'G', 0, 0, aapp, one,
899  $ m, 1, cwork(n+1), lda, ierr )
900  aapq = cdotc( m, cwork(n+1), 1,
901  $ a( 1, q ), 1 ) / aaqq
902  END IF
903  ELSE
904  rotok = aapp.LE.( aaqq / small )
905  IF( aapp.GT.( small / aaqq ) ) THEN
906  aapq = ( cdotc( m, a( 1, p ), 1,
907  $ a( 1, q ), 1 ) / aapp ) / aaqq
908  ELSE
909  CALL ccopy( m, a( 1, q ), 1,
910  $ cwork(n+1), 1 )
911  CALL clascl( 'G', 0, 0, aaqq,
912  $ one, m, 1,
913  $ cwork(n+1), lda, ierr )
914  aapq = cdotc( m, a(1, p ), 1,
915  $ cwork(n+1), 1 ) / aapp
916  END IF
917  END IF
918 *
919 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
920  aapq1 = -abs(aapq)
921  mxaapq = max( mxaapq, -aapq1 )
922 *
923 * TO rotate or NOT to rotate, THAT is the question ...
924 *
925  IF( abs( aapq1 ).GT.tol ) THEN
926  ompq = aapq / abs(aapq)
927 *
928 * .. rotate
929 *[RTD] ROTATED = ROTATED + ONE
930 *
931  IF( ir1.EQ.0 ) THEN
932  notrot = 0
933  pskipped = 0
934  iswrot = iswrot + 1
935  END IF
936 *
937  IF( rotok ) THEN
938 *
939  aqoap = aaqq / aapp
940  apoaq = aapp / aaqq
941  theta = -half*abs( aqoap-apoaq )/aapq1
942 *
943  IF( abs( theta ).GT.bigtheta ) THEN
944 *
945  t = half / theta
946  cs = one
947 
948  CALL crot( m, a(1,p), 1, a(1,q), 1,
949  $ cs, conjg(ompq)*t )
950  IF ( rsvec ) THEN
951  CALL crot( mvl, v(1,p), 1,
952  $ v(1,q), 1, cs, conjg(ompq)*t )
953  END IF
954 
955  sva( q ) = aaqq*sqrt( max( zero,
956  $ one+t*apoaq*aapq1 ) )
957  aapp = aapp*sqrt( max( zero,
958  $ one-t*aqoap*aapq1 ) )
959  mxsinj = max( mxsinj, abs( t ) )
960 *
961  ELSE
962 *
963 * .. choose correct signum for THETA and rotate
964 *
965  thsign = -sign( one, aapq1 )
966  t = one / ( theta+thsign*
967  $ sqrt( one+theta*theta ) )
968  cs = sqrt( one / ( one+t*t ) )
969  sn = t*cs
970 *
971  mxsinj = max( mxsinj, abs( sn ) )
972  sva( q ) = aaqq*sqrt( max( zero,
973  $ one+t*apoaq*aapq1 ) )
974  aapp = aapp*sqrt( max( zero,
975  $ one-t*aqoap*aapq1 ) )
976 *
977  CALL crot( m, a(1,p), 1, a(1,q), 1,
978  $ cs, conjg(ompq)*sn )
979  IF ( rsvec ) THEN
980  CALL crot( mvl, v(1,p), 1,
981  $ v(1,q), 1, cs, conjg(ompq)*sn )
982  END IF
983  END IF
984  cwork(p) = -cwork(q) * ompq
985 *
986  ELSE
987 * .. have to use modified Gram-Schmidt like transformation
988  CALL ccopy( m, a( 1, p ), 1,
989  $ cwork(n+1), 1 )
990  CALL clascl( 'G', 0, 0, aapp, one, m,
991  $ 1, cwork(n+1), lda,
992  $ ierr )
993  CALL clascl( 'G', 0, 0, aaqq, one, m,
994  $ 1, a( 1, q ), lda, ierr )
995  CALL caxpy( m, -aapq, cwork(n+1), 1,
996  $ a( 1, q ), 1 )
997  CALL clascl( 'G', 0, 0, one, aaqq, m,
998  $ 1, a( 1, q ), lda, ierr )
999  sva( q ) = aaqq*sqrt( max( zero,
1000  $ one-aapq1*aapq1 ) )
1001  mxsinj = max( mxsinj, sfmin )
1002  END IF
1003 * END IF ROTOK THEN ... ELSE
1004 *
1005 * In the case of cancellation in updating SVA(q), SVA(p)
1006 * recompute SVA(q), SVA(p).
1007 *
1008  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1009  $ THEN
1010  IF( ( aaqq.LT.rootbig ) .AND.
1011  $ ( aaqq.GT.rootsfmin ) ) THEN
1012  sva( q ) = scnrm2( m, a( 1, q ), 1 )
1013  ELSE
1014  t = zero
1015  aaqq = one
1016  CALL classq( m, a( 1, q ), 1, t,
1017  $ aaqq )
1018  sva( q ) = t*sqrt( aaqq )
1019  END IF
1020  END IF
1021  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1022  IF( ( aapp.LT.rootbig ) .AND.
1023  $ ( aapp.GT.rootsfmin ) ) THEN
1024  aapp = scnrm2( m, a( 1, p ), 1 )
1025  ELSE
1026  t = zero
1027  aapp = one
1028  CALL classq( m, a( 1, p ), 1, t,
1029  $ aapp )
1030  aapp = t*sqrt( aapp )
1031  END IF
1032  sva( p ) = aapp
1033  END IF
1034 *
1035  ELSE
1036 * A(:,p) and A(:,q) already numerically orthogonal
1037  IF( ir1.EQ.0 )notrot = notrot + 1
1038 *[RTD] SKIPPED = SKIPPED + 1
1039  pskipped = pskipped + 1
1040  END IF
1041  ELSE
1042 * A(:,q) is zero column
1043  IF( ir1.EQ.0 )notrot = notrot + 1
1044  pskipped = pskipped + 1
1045  END IF
1046 *
1047  IF( ( i.LE.swband ) .AND.
1048  $ ( pskipped.GT.rowskip ) ) THEN
1049  IF( ir1.EQ.0 )aapp = -aapp
1050  notrot = 0
1051  GO TO 2103
1052  END IF
1053 *
1054  2002 CONTINUE
1055 * END q-LOOP
1056 *
1057  2103 CONTINUE
1058 * bailed out of q-loop
1059 *
1060  sva( p ) = aapp
1061 *
1062  ELSE
1063  sva( p ) = aapp
1064  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1065  $ notrot = notrot + min( igl+kbl-1, n ) - p
1066  END IF
1067 *
1068  2001 CONTINUE
1069 * end of the p-loop
1070 * end of doing the block ( ibr, ibr )
1071  1002 CONTINUE
1072 * end of ir1-loop
1073 *
1074 * ... go to the off diagonal blocks
1075 *
1076  igl = ( ibr-1 )*kbl + 1
1077 *
1078  DO 2010 jbc = ibr + 1, nbl
1079 *
1080  jgl = ( jbc-1 )*kbl + 1
1081 *
1082 * doing the block at ( ibr, jbc )
1083 *
1084  ijblsk = 0
1085  DO 2100 p = igl, min( igl+kbl-1, n )
1086 *
1087  aapp = sva( p )
1088  IF( aapp.GT.zero ) THEN
1089 *
1090  pskipped = 0
1091 *
1092  DO 2200 q = jgl, min( jgl+kbl-1, n )
1093 *
1094  aaqq = sva( q )
1095  IF( aaqq.GT.zero ) THEN
1096  aapp0 = aapp
1097 *
1098 * .. M x 2 Jacobi SVD ..
1099 *
1100 * Safe Gram matrix computation
1101 *
1102  IF( aaqq.GE.one ) THEN
1103  IF( aapp.GE.aaqq ) THEN
1104  rotok = ( small*aapp ).LE.aaqq
1105  ELSE
1106  rotok = ( small*aaqq ).LE.aapp
1107  END IF
1108  IF( aapp.LT.( big / aaqq ) ) THEN
1109  aapq = ( cdotc( m, a( 1, p ), 1,
1110  $ a( 1, q ), 1 ) / aaqq ) / aapp
1111  ELSE
1112  CALL ccopy( m, a( 1, p ), 1,
1113  $ cwork(n+1), 1 )
1114  CALL clascl( 'G', 0, 0, aapp,
1115  $ one, m, 1,
1116  $ cwork(n+1), lda, ierr )
1117  aapq = cdotc( m, cwork(n+1), 1,
1118  $ a( 1, q ), 1 ) / aaqq
1119  END IF
1120  ELSE
1121  IF( aapp.GE.aaqq ) THEN
1122  rotok = aapp.LE.( aaqq / small )
1123  ELSE
1124  rotok = aaqq.LE.( aapp / small )
1125  END IF
1126  IF( aapp.GT.( small / aaqq ) ) THEN
1127  aapq = ( cdotc( m, a( 1, p ), 1,
1128  $ a( 1, q ), 1 ) / max(aaqq,aapp) )
1129  $ / min(aaqq,aapp)
1130  ELSE
1131  CALL ccopy( m, a( 1, q ), 1,
1132  $ cwork(n+1), 1 )
1133  CALL clascl( 'G', 0, 0, aaqq,
1134  $ one, m, 1,
1135  $ cwork(n+1), lda, ierr )
1136  aapq = cdotc( m, a( 1, p ), 1,
1137  $ cwork(n+1), 1 ) / aapp
1138  END IF
1139  END IF
1140 *
1141 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1142  aapq1 = -abs(aapq)
1143  mxaapq = max( mxaapq, -aapq1 )
1144 *
1145 * TO rotate or NOT to rotate, THAT is the question ...
1146 *
1147  IF( abs( aapq1 ).GT.tol ) THEN
1148  ompq = aapq / abs(aapq)
1149  notrot = 0
1150 *[RTD] ROTATED = ROTATED + 1
1151  pskipped = 0
1152  iswrot = iswrot + 1
1153 *
1154  IF( rotok ) THEN
1155 *
1156  aqoap = aaqq / aapp
1157  apoaq = aapp / aaqq
1158  theta = -half*abs( aqoap-apoaq )/ aapq1
1159  IF( aaqq.GT.aapp0 )theta = -theta
1160 *
1161  IF( abs( theta ).GT.bigtheta ) THEN
1162  t = half / theta
1163  cs = one
1164  CALL crot( m, a(1,p), 1, a(1,q), 1,
1165  $ cs, conjg(ompq)*t )
1166  IF( rsvec ) THEN
1167  CALL crot( mvl, v(1,p), 1,
1168  $ v(1,q), 1, cs, conjg(ompq)*t )
1169  END IF
1170  sva( q ) = aaqq*sqrt( max( zero,
1171  $ one+t*apoaq*aapq1 ) )
1172  aapp = aapp*sqrt( max( zero,
1173  $ one-t*aqoap*aapq1 ) )
1174  mxsinj = max( mxsinj, abs( t ) )
1175  ELSE
1176 *
1177 * .. choose correct signum for THETA and rotate
1178 *
1179  thsign = -sign( one, aapq1 )
1180  IF( aaqq.GT.aapp0 )thsign = -thsign
1181  t = one / ( theta+thsign*
1182  $ sqrt( one+theta*theta ) )
1183  cs = sqrt( one / ( one+t*t ) )
1184  sn = t*cs
1185  mxsinj = max( mxsinj, abs( sn ) )
1186  sva( q ) = aaqq*sqrt( max( zero,
1187  $ one+t*apoaq*aapq1 ) )
1188  aapp = aapp*sqrt( max( zero,
1189  $ one-t*aqoap*aapq1 ) )
1190 *
1191  CALL crot( m, a(1,p), 1, a(1,q), 1,
1192  $ cs, conjg(ompq)*sn )
1193  IF( rsvec ) THEN
1194  CALL crot( mvl, v(1,p), 1,
1195  $ v(1,q), 1, cs, conjg(ompq)*sn )
1196  END IF
1197  END IF
1198  cwork(p) = -cwork(q) * ompq
1199 *
1200  ELSE
1201 * .. have to use modified Gram-Schmidt like transformation
1202  IF( aapp.GT.aaqq ) THEN
1203  CALL ccopy( m, a( 1, p ), 1,
1204  $ cwork(n+1), 1 )
1205  CALL clascl( 'G', 0, 0, aapp, one,
1206  $ m, 1, cwork(n+1),lda,
1207  $ ierr )
1208  CALL clascl( 'G', 0, 0, aaqq, one,
1209  $ m, 1, a( 1, q ), lda,
1210  $ ierr )
1211  CALL caxpy( m, -aapq, cwork(n+1),
1212  $ 1, a( 1, q ), 1 )
1213  CALL clascl( 'G', 0, 0, one, aaqq,
1214  $ m, 1, a( 1, q ), lda,
1215  $ ierr )
1216  sva( q ) = aaqq*sqrt( max( zero,
1217  $ one-aapq1*aapq1 ) )
1218  mxsinj = max( mxsinj, sfmin )
1219  ELSE
1220  CALL ccopy( m, a( 1, q ), 1,
1221  $ cwork(n+1), 1 )
1222  CALL clascl( 'G', 0, 0, aaqq, one,
1223  $ m, 1, cwork(n+1),lda,
1224  $ ierr )
1225  CALL clascl( 'G', 0, 0, aapp, one,
1226  $ m, 1, a( 1, p ), lda,
1227  $ ierr )
1228  CALL caxpy( m, -conjg(aapq),
1229  $ cwork(n+1), 1, a( 1, p ), 1 )
1230  CALL clascl( 'G', 0, 0, one, aapp,
1231  $ m, 1, a( 1, p ), lda,
1232  $ ierr )
1233  sva( p ) = aapp*sqrt( max( zero,
1234  $ one-aapq1*aapq1 ) )
1235  mxsinj = max( mxsinj, sfmin )
1236  END IF
1237  END IF
1238 * END IF ROTOK THEN ... ELSE
1239 *
1240 * In the case of cancellation in updating SVA(q), SVA(p)
1241 * .. recompute SVA(q), SVA(p)
1242  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1243  $ THEN
1244  IF( ( aaqq.LT.rootbig ) .AND.
1245  $ ( aaqq.GT.rootsfmin ) ) THEN
1246  sva( q ) = scnrm2( m, a( 1, q ), 1)
1247  ELSE
1248  t = zero
1249  aaqq = one
1250  CALL classq( m, a( 1, q ), 1, t,
1251  $ aaqq )
1252  sva( q ) = t*sqrt( aaqq )
1253  END IF
1254  END IF
1255  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1256  IF( ( aapp.LT.rootbig ) .AND.
1257  $ ( aapp.GT.rootsfmin ) ) THEN
1258  aapp = scnrm2( m, a( 1, p ), 1 )
1259  ELSE
1260  t = zero
1261  aapp = one
1262  CALL classq( m, a( 1, p ), 1, t,
1263  $ aapp )
1264  aapp = t*sqrt( aapp )
1265  END IF
1266  sva( p ) = aapp
1267  END IF
1268 * end of OK rotation
1269  ELSE
1270  notrot = notrot + 1
1271 *[RTD] SKIPPED = SKIPPED + 1
1272  pskipped = pskipped + 1
1273  ijblsk = ijblsk + 1
1274  END IF
1275  ELSE
1276  notrot = notrot + 1
1277  pskipped = pskipped + 1
1278  ijblsk = ijblsk + 1
1279  END IF
1280 *
1281  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1282  $ THEN
1283  sva( p ) = aapp
1284  notrot = 0
1285  GO TO 2011
1286  END IF
1287  IF( ( i.LE.swband ) .AND.
1288  $ ( pskipped.GT.rowskip ) ) THEN
1289  aapp = -aapp
1290  notrot = 0
1291  GO TO 2203
1292  END IF
1293 *
1294  2200 CONTINUE
1295 * end of the q-loop
1296  2203 CONTINUE
1297 *
1298  sva( p ) = aapp
1299 *
1300  ELSE
1301 *
1302  IF( aapp.EQ.zero )notrot = notrot +
1303  $ min( jgl+kbl-1, n ) - jgl + 1
1304  IF( aapp.LT.zero )notrot = 0
1305 *
1306  END IF
1307 *
1308  2100 CONTINUE
1309 * end of the p-loop
1310  2010 CONTINUE
1311 * end of the jbc-loop
1312  2011 CONTINUE
1313 *2011 bailed out of the jbc-loop
1314  DO 2012 p = igl, min( igl+kbl-1, n )
1315  sva( p ) = abs( sva( p ) )
1316  2012 CONTINUE
1317 ***
1318  2000 CONTINUE
1319 *2000 :: end of the ibr-loop
1320 *
1321 * .. update SVA(N)
1322  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1323  $ THEN
1324  sva( n ) = scnrm2( m, a( 1, n ), 1 )
1325  ELSE
1326  t = zero
1327  aapp = one
1328  CALL classq( m, a( 1, n ), 1, t, aapp )
1329  sva( n ) = t*sqrt( aapp )
1330  END IF
1331 *
1332 * Additional steering devices
1333 *
1334  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1335  $ ( iswrot.LE.n ) ) )swband = i
1336 *
1337  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( REAL( N ) )*
1338  $ tol ) .AND. ( REAL( n )*mxaapq*mxsinj.LT.tol ) ) then
1339  GO TO 1994
1340  END IF
1341 *
1342  IF( notrot.GE.emptsw )GO TO 1994
1343 *
1344  1993 CONTINUE
1345 * end i=1:NSWEEP loop
1346 *
1347 * #:( Reaching this point means that the procedure has not converged.
1348  info = nsweep - 1
1349  GO TO 1995
1350 *
1351  1994 CONTINUE
1352 * #:) Reaching this point means numerical convergence after the i-th
1353 * sweep.
1354 *
1355  info = 0
1356 * #:) INFO = 0 confirms successful iterations.
1357  1995 CONTINUE
1358 *
1359 * Sort the singular values and find how many are above
1360 * the underflow threshold.
1361 *
1362  n2 = 0
1363  n4 = 0
1364  DO 5991 p = 1, n - 1
1365  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1366  IF( p.NE.q ) THEN
1367  temp1 = sva( p )
1368  sva( p ) = sva( q )
1369  sva( q ) = temp1
1370  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1371  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1372  END IF
1373  IF( sva( p ).NE.zero ) THEN
1374  n4 = n4 + 1
1375  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1376  END IF
1377  5991 CONTINUE
1378  IF( sva( n ).NE.zero ) THEN
1379  n4 = n4 + 1
1380  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1381  END IF
1382 *
1383 * Normalize the left singular vectors.
1384 *
1385  IF( lsvec .OR. uctol ) THEN
1386  DO 1998 p = 1, n4
1387 * CALL CSSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1388  CALL clascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1389  1998 CONTINUE
1390  END IF
1391 *
1392 * Scale the product of Jacobi rotations.
1393 *
1394  IF( rsvec ) THEN
1395  DO 2399 p = 1, n
1396  temp1 = one / scnrm2( mvl, v( 1, p ), 1 )
1397  CALL csscal( mvl, temp1, v( 1, p ), 1 )
1398  2399 CONTINUE
1399  END IF
1400 *
1401 * Undo scaling, if necessary (and possible).
1402  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1403  $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1404  $ ( sfmin / skl ) ) ) ) THEN
1405  DO 2400 p = 1, n
1406  sva( p ) = skl*sva( p )
1407  2400 CONTINUE
1408  skl = one
1409  END IF
1410 *
1411  rwork( 1 ) = skl
1412 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1413 * then some of the singular values may overflow or underflow and
1414 * the spectrum is given in this factored representation.
1415 *
1416  rwork( 2 ) = REAL( n4 )
1417 * N4 is the number of computed nonzero singular values of A.
1418 *
1419  rwork( 3 ) = REAL( n2 )
1420 * N2 is the number of singular values of A greater than SFMIN.
1421 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1422 * that may carry some information.
1423 *
1424  rwork( 4 ) = REAL( i )
1425 * i is the index of the last sweep before declaring convergence.
1426 *
1427  rwork( 5 ) = mxaapq
1428 * MXAAPQ is the largest absolute value of scaled pivots in the
1429 * last sweep
1430 *
1431  rwork( 6 ) = mxsinj
1432 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1433 * in the last sweep
1434 *
1435  RETURN
1436 * ..
1437 * .. END OF CGESVJ
1438 * ..
real function scnrm2(N, X, INCX)
SCNRM2
Definition: scnrm2.f:56
subroutine cgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ0 pre-processor for the routine cgesvj.
Definition: cgsvj0.f:220
subroutine classq(N, X, INCX, SCALE, SUMSQ)
CLASSQ updates a sum of squares represented in scaled form.
Definition: classq.f:108
subroutine crot(N, CX, INCX, CY, INCY, C, S)
CROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors...
Definition: crot.f:105
subroutine claset(UPLO, M, N, ALPHA, BETA, A, LDA)
CLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: claset.f:108
subroutine clascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
CLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: clascl.f:145
integer function isamax(N, SX, INCX)
ISAMAX
Definition: isamax.f:53
complex function cdotc(N, CX, INCX, CY, INCY)
CDOTC
Definition: cdotc.f:54
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:145
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
real function slamch(CMACH)
SLAMCH
Definition: slamch.f:69
subroutine ccopy(N, CX, INCX, CY, INCY)
CCOPY
Definition: ccopy.f:52
subroutine cswap(N, CX, INCX, CY, INCY)
CSWAP
Definition: cswap.f:52
subroutine caxpy(N, CA, CX, INCX, CY, INCY)
CAXPY
Definition: caxpy.f:53
subroutine csscal(N, SA, CX, INCX)
CSSCAL
Definition: csscal.f:54
subroutine cgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ1 pre-processor for the routine cgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: cgsvj1.f:238

Here is the call graph for this function:

Here is the caller graph for this function: