LAPACK  3.7.0
LAPACK: Linear Algebra PACKage
subroutine zgesvj ( character*1  JOBA,
character*1  JOBU,
character*1  JOBV,
integer  M,
integer  N,
complex*16, dimension( lda, * )  A,
integer  LDA,
double precision, dimension( n )  SVA,
integer  MV,
complex*16, dimension( ldv, * )  V,
integer  LDV,
complex*16, dimension( lwork )  CWORK,
integer  LWORK,
double precision, dimension( lrwork )  RWORK,
integer  LRWORK,
integer  INFO 
)

ZGESVJ

Download ZGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 ZGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER* 1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U' or 'F': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=DLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' or 'J': the matrix V is computed and returned in the array V
          = 'A' : the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly; instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N' : the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/DLAMCH('E') > M >= 0.
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX*16 array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C':
                 If INFO .EQ. 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold DLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'),
                 see the description of JOBU.
                 If INFO .GT. 0,
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU .EQ. 'N':
                 If INFO .EQ. 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO .GT. 0 :
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is DOUBLE PRECISION array, dimension (N)
          On exit,
          If INFO .EQ. 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE .EQ. ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO .GT. 0 :
          the procedure ZGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV .EQ. 'A', then the product of Jacobi rotations in ZGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX*16 array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV .GE. 1.
          If JOBV .EQ. 'V', then LDV .GE. max(1,N).
          If JOBV .EQ. 'A', then LDV .GE. max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX*16 array, dimension max(1,LWORK).
          Used as workspace.
          If on entry LWORK .EQ. -1, then a workspace query is assumed and
          no computation is done; CWORK(1) is set to the minial (and optimal)
          length of CWORK.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is DOUBLE PRECISION array, dimension max(6,LRWORK).
          On entry,
          If JOBU .EQ. 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=DLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when ZGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is stil useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
         If on entry LRWORK .EQ. -1, then a workspace query is assumed and
         no computation is done; RWORK(1) is set to the minial (and optimal)
         length of RWORK.
[in]LRWORK
         LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0 : successful exit.
          < 0 : if INFO = -i, then the i-th argument had an illegal value
          > 0 : ZGESVJ did not converge in the maximal allowed number
                (NSWEEP=30) of sweeps. The output may still be useful.
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Date
June 2016
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tunning parameters (marked with [TP]) are available for the
 implementer.
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributor:
  ============

  Zlatko Drmac (Zagreb, Croatia)
References:
[1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer. SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic. SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. LAPACK Working note 169. [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. LAPACK Working note 170. [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations. Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  [email protected]. Thank you.

Definition at line 351 of file zgesvj.f.

351 *
352 * -- LAPACK computational routine (version 3.7.0) --
353 * -- LAPACK is a software package provided by Univ. of Tennessee, --
354 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355 * June 2016
356 *
357  IMPLICIT NONE
358 * .. Scalar Arguments ..
359  INTEGER info, lda, ldv, lwork, lrwork, m, mv, n
360  CHARACTER*1 joba, jobu, jobv
361 * ..
362 * .. Array Arguments ..
363  COMPLEX*16 a( lda, * ), v( ldv, * ), cwork( lwork )
364  DOUBLE PRECISION rwork( lrwork ), sva( n )
365 * ..
366 *
367 * =====================================================================
368 *
369 * .. Local Parameters ..
370  DOUBLE PRECISION zero, half, one
371  parameter( zero = 0.0d0, half = 0.5d0, one = 1.0d0)
372  COMPLEX*16 czero, cone
373  parameter( czero = (0.0d0, 0.0d0), cone = (1.0d0, 0.0d0) )
374  INTEGER nsweep
375  parameter( nsweep = 30 )
376 * ..
377 * .. Local Scalars ..
378  COMPLEX*16 aapq, ompq
379  DOUBLE PRECISION aapp, aapp0, aapq1, aaqq, apoaq, aqoap, big,
380  $ bigtheta, cs, ctol, epsln, mxaapq,
381  $ mxsinj, rootbig, rooteps, rootsfmin, roottol,
382  $ skl, sfmin, small, sn, t, temp1, theta, thsign, tol
383  INTEGER blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
384  $ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
385  $ n4, nbl, notrot, p, pskipped, q, rowskip, swband
386  LOGICAL applv, goscale, lower, lquery, lsvec, noscale, rotok,
387  $ rsvec, uctol, upper
388 * ..
389 * ..
390 * .. Intrinsic Functions ..
391  INTRINSIC abs, max, min, conjg, dble, sign, sqrt
392 * ..
393 * .. External Functions ..
394 * ..
395 * from BLAS
396  DOUBLE PRECISION dznrm2
397  COMPLEX*16 zdotc
398  EXTERNAL zdotc, dznrm2
399  INTEGER idamax
400  EXTERNAL idamax
401 * from LAPACK
402  DOUBLE PRECISION dlamch
403  EXTERNAL dlamch
404  LOGICAL lsame
405  EXTERNAL lsame
406 * ..
407 * .. External Subroutines ..
408 * ..
409 * from BLAS
410  EXTERNAL zcopy, zrot, zdscal, zswap
411 * from LAPACK
412  EXTERNAL dlascl, zlascl, zlaset, zlassq, xerbla
413  EXTERNAL zgsvj0, zgsvj1
414 * ..
415 * .. Executable Statements ..
416 *
417 * Test the input arguments
418 *
419  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
420  uctol = lsame( jobu, 'C' )
421  rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
422  applv = lsame( jobv, 'A' )
423  upper = lsame( joba, 'U' )
424  lower = lsame( joba, 'L' )
425 *
426  lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
427  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
428  info = -1
429  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
430  info = -2
431  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
432  info = -3
433  ELSE IF( m.LT.0 ) THEN
434  info = -4
435  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
436  info = -5
437  ELSE IF( lda.LT.m ) THEN
438  info = -7
439  ELSE IF( mv.LT.0 ) THEN
440  info = -9
441  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
442  $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
443  info = -11
444  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
445  info = -12
446  ELSE IF( ( lwork.LT.( m+n ) ) .AND. ( .NOT.lquery ) ) THEN
447  info = -13
448  ELSE IF( ( lrwork.LT.max( n, 6 ) ) .AND. ( .NOT.lquery ) ) THEN
449  info = -15
450  ELSE
451  info = 0
452  END IF
453 *
454 * #:(
455  IF( info.NE.0 ) THEN
456  CALL xerbla( 'ZGESVJ', -info )
457  RETURN
458  ELSE IF ( lquery ) THEN
459  cwork(1) = m + n
460  rwork(1) = max( n, 6 )
461  RETURN
462  END IF
463 *
464 * #:) Quick return for void matrix
465 *
466  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
467 *
468 * Set numerical parameters
469 * The stopping criterion for Jacobi rotations is
470 *
471 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
472 *
473 * where EPS is the round-off and CTOL is defined as follows:
474 *
475  IF( uctol ) THEN
476 * ... user controlled
477  ctol = rwork( 1 )
478  ELSE
479 * ... default
480  IF( lsvec .OR. rsvec .OR. applv ) THEN
481  ctol = sqrt( dble( m ) )
482  ELSE
483  ctol = dble( m )
484  END IF
485  END IF
486 * ... and the machine dependent parameters are
487 *[!] (Make sure that SLAMCH() works properly on the target machine.)
488 *
489  epsln = dlamch( 'Epsilon' )
490  rooteps = sqrt( epsln )
491  sfmin = dlamch( 'SafeMinimum' )
492  rootsfmin = sqrt( sfmin )
493  small = sfmin / epsln
494  big = dlamch( 'Overflow' )
495 * BIG = ONE / SFMIN
496  rootbig = one / rootsfmin
497 * LARGE = BIG / SQRT( DBLE( M*N ) )
498  bigtheta = one / rooteps
499 *
500  tol = ctol*epsln
501  roottol = sqrt( tol )
502 *
503  IF( dble( m )*epsln.GE.one ) THEN
504  info = -4
505  CALL xerbla( 'ZGESVJ', -info )
506  RETURN
507  END IF
508 *
509 * Initialize the right singular vector matrix.
510 *
511  IF( rsvec ) THEN
512  mvl = n
513  CALL zlaset( 'A', mvl, n, czero, cone, v, ldv )
514  ELSE IF( applv ) THEN
515  mvl = mv
516  END IF
517  rsvec = rsvec .OR. applv
518 *
519 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
520 *(!) If necessary, scale A to protect the largest singular value
521 * from overflow. It is possible that saving the largest singular
522 * value destroys the information about the small ones.
523 * This initial scaling is almost minimal in the sense that the
524 * goal is to make sure that no column norm overflows, and that
525 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
526 * in A are detected, the procedure returns with INFO=-6.
527 *
528  skl = one / sqrt( dble( m )*dble( n ) )
529  noscale = .true.
530  goscale = .true.
531 *
532  IF( lower ) THEN
533 * the input matrix is M-by-N lower triangular (trapezoidal)
534  DO 1874 p = 1, n
535  aapp = zero
536  aaqq = one
537  CALL zlassq( m-p+1, a( p, p ), 1, aapp, aaqq )
538  IF( aapp.GT.big ) THEN
539  info = -6
540  CALL xerbla( 'ZGESVJ', -info )
541  RETURN
542  END IF
543  aaqq = sqrt( aaqq )
544  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
545  sva( p ) = aapp*aaqq
546  ELSE
547  noscale = .false.
548  sva( p ) = aapp*( aaqq*skl )
549  IF( goscale ) THEN
550  goscale = .false.
551  DO 1873 q = 1, p - 1
552  sva( q ) = sva( q )*skl
553  1873 CONTINUE
554  END IF
555  END IF
556  1874 CONTINUE
557  ELSE IF( upper ) THEN
558 * the input matrix is M-by-N upper triangular (trapezoidal)
559  DO 2874 p = 1, n
560  aapp = zero
561  aaqq = one
562  CALL zlassq( p, a( 1, p ), 1, aapp, aaqq )
563  IF( aapp.GT.big ) THEN
564  info = -6
565  CALL xerbla( 'ZGESVJ', -info )
566  RETURN
567  END IF
568  aaqq = sqrt( aaqq )
569  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
570  sva( p ) = aapp*aaqq
571  ELSE
572  noscale = .false.
573  sva( p ) = aapp*( aaqq*skl )
574  IF( goscale ) THEN
575  goscale = .false.
576  DO 2873 q = 1, p - 1
577  sva( q ) = sva( q )*skl
578  2873 CONTINUE
579  END IF
580  END IF
581  2874 CONTINUE
582  ELSE
583 * the input matrix is M-by-N general dense
584  DO 3874 p = 1, n
585  aapp = zero
586  aaqq = one
587  CALL zlassq( m, a( 1, p ), 1, aapp, aaqq )
588  IF( aapp.GT.big ) THEN
589  info = -6
590  CALL xerbla( 'ZGESVJ', -info )
591  RETURN
592  END IF
593  aaqq = sqrt( aaqq )
594  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
595  sva( p ) = aapp*aaqq
596  ELSE
597  noscale = .false.
598  sva( p ) = aapp*( aaqq*skl )
599  IF( goscale ) THEN
600  goscale = .false.
601  DO 3873 q = 1, p - 1
602  sva( q ) = sva( q )*skl
603  3873 CONTINUE
604  END IF
605  END IF
606  3874 CONTINUE
607  END IF
608 *
609  IF( noscale )skl = one
610 *
611 * Move the smaller part of the spectrum from the underflow threshold
612 *(!) Start by determining the position of the nonzero entries of the
613 * array SVA() relative to ( SFMIN, BIG ).
614 *
615  aapp = zero
616  aaqq = big
617  DO 4781 p = 1, n
618  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
619  aapp = max( aapp, sva( p ) )
620  4781 CONTINUE
621 *
622 * #:) Quick return for zero matrix
623 *
624  IF( aapp.EQ.zero ) THEN
625  IF( lsvec )CALL zlaset( 'G', m, n, czero, cone, a, lda )
626  rwork( 1 ) = one
627  rwork( 2 ) = zero
628  rwork( 3 ) = zero
629  rwork( 4 ) = zero
630  rwork( 5 ) = zero
631  rwork( 6 ) = zero
632  RETURN
633  END IF
634 *
635 * #:) Quick return for one-column matrix
636 *
637  IF( n.EQ.1 ) THEN
638  IF( lsvec )CALL zlascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
639  $ a( 1, 1 ), lda, ierr )
640  rwork( 1 ) = one / skl
641  IF( sva( 1 ).GE.sfmin ) THEN
642  rwork( 2 ) = one
643  ELSE
644  rwork( 2 ) = zero
645  END IF
646  rwork( 3 ) = zero
647  rwork( 4 ) = zero
648  rwork( 5 ) = zero
649  rwork( 6 ) = zero
650  RETURN
651  END IF
652 *
653 * Protect small singular values from underflow, and try to
654 * avoid underflows/overflows in computing Jacobi rotations.
655 *
656  sn = sqrt( sfmin / epsln )
657  temp1 = sqrt( big / dble( n ) )
658  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
659  $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
660  temp1 = min( big, temp1 / aapp )
661 * AAQQ = AAQQ*TEMP1
662 * AAPP = AAPP*TEMP1
663  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
664  temp1 = min( sn / aaqq, big / (aapp*sqrt( dble(n)) ) )
665 * AAQQ = AAQQ*TEMP1
666 * AAPP = AAPP*TEMP1
667  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
668  temp1 = max( sn / aaqq, temp1 / aapp )
669 * AAQQ = AAQQ*TEMP1
670 * AAPP = AAPP*TEMP1
671  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
672  temp1 = min( sn / aaqq, big / ( sqrt( dble( n ) )*aapp ) )
673 * AAQQ = AAQQ*TEMP1
674 * AAPP = AAPP*TEMP1
675  ELSE
676  temp1 = one
677  END IF
678 *
679 * Scale, if necessary
680 *
681  IF( temp1.NE.one ) THEN
682  CALL dlascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
683  END IF
684  skl = temp1*skl
685  IF( skl.NE.one ) THEN
686  CALL zlascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
687  skl = one / skl
688  END IF
689 *
690 * Row-cyclic Jacobi SVD algorithm with column pivoting
691 *
692  emptsw = ( n*( n-1 ) ) / 2
693  notrot = 0
694 
695  DO 1868 q = 1, n
696  cwork( q ) = cone
697  1868 CONTINUE
698 *
699 *
700 *
701  swband = 3
702 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
703 * if ZGESVJ is used as a computational routine in the preconditioned
704 * Jacobi SVD algorithm ZGEJSV. For sweeps i=1:SWBAND the procedure
705 * works on pivots inside a band-like region around the diagonal.
706 * The boundaries are determined dynamically, based on the number of
707 * pivots above a threshold.
708 *
709  kbl = min( 8, n )
710 *[TP] KBL is a tuning parameter that defines the tile size in the
711 * tiling of the p-q loops of pivot pairs. In general, an optimal
712 * value of KBL depends on the matrix dimensions and on the
713 * parameters of the computer's memory.
714 *
715  nbl = n / kbl
716  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
717 *
718  blskip = kbl**2
719 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
720 *
721  rowskip = min( 5, kbl )
722 *[TP] ROWSKIP is a tuning parameter.
723 *
724  lkahead = 1
725 *[TP] LKAHEAD is a tuning parameter.
726 *
727 * Quasi block transformations, using the lower (upper) triangular
728 * structure of the input matrix. The quasi-block-cycling usually
729 * invokes cubic convergence. Big part of this cycle is done inside
730 * canonical subspaces of dimensions less than M.
731 *
732  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
733 *[TP] The number of partition levels and the actual partition are
734 * tuning parameters.
735  n4 = n / 4
736  n2 = n / 2
737  n34 = 3*n4
738  IF( applv ) THEN
739  q = 0
740  ELSE
741  q = 1
742  END IF
743 *
744  IF( lower ) THEN
745 *
746 * This works very well on lower triangular matrices, in particular
747 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
748 * The idea is simple:
749 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
750 * [+ + 0 0] [0 0]
751 * [+ + x 0] actually work on [x 0] [x 0]
752 * [+ + x x] [x x]. [x x]
753 *
754  CALL zgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
755  $ cwork( n34+1 ), sva( n34+1 ), mvl,
756  $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
757  $ 2, cwork( n+1 ), lwork-n, ierr )
758 
759  CALL zgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
760  $ cwork( n2+1 ), sva( n2+1 ), mvl,
761  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
762  $ cwork( n+1 ), lwork-n, ierr )
763 
764  CALL zgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
765  $ cwork( n2+1 ), sva( n2+1 ), mvl,
766  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
767  $ cwork( n+1 ), lwork-n, ierr )
768 
769  CALL zgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
770  $ cwork( n4+1 ), sva( n4+1 ), mvl,
771  $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
772  $ cwork( n+1 ), lwork-n, ierr )
773 *
774  CALL zgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
775  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
776  $ ierr )
777 *
778  CALL zgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
779  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
780  $ lwork-n, ierr )
781 *
782 *
783  ELSE IF( upper ) THEN
784 *
785 *
786  CALL zgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
787  $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
788  $ ierr )
789 *
790  CALL zgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
791  $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
792  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
793  $ ierr )
794 *
795  CALL zgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
796  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
797  $ lwork-n, ierr )
798 *
799  CALL zgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
800  $ cwork( n2+1 ), sva( n2+1 ), mvl,
801  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
802  $ cwork( n+1 ), lwork-n, ierr )
803 
804  END IF
805 *
806  END IF
807 *
808 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
809 *
810  DO 1993 i = 1, nsweep
811 *
812 * .. go go go ...
813 *
814  mxaapq = zero
815  mxsinj = zero
816  iswrot = 0
817 *
818  notrot = 0
819  pskipped = 0
820 *
821 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
822 * 1 <= p < q <= N. This is the first step toward a blocked implementation
823 * of the rotations. New implementation, based on block transformations,
824 * is under development.
825 *
826  DO 2000 ibr = 1, nbl
827 *
828  igl = ( ibr-1 )*kbl + 1
829 *
830  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
831 *
832  igl = igl + ir1*kbl
833 *
834  DO 2001 p = igl, min( igl+kbl-1, n-1 )
835 *
836 * .. de Rijk's pivoting
837 *
838  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
839  IF( p.NE.q ) THEN
840  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
841  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1,
842  $ v( 1, q ), 1 )
843  temp1 = sva( p )
844  sva( p ) = sva( q )
845  sva( q ) = temp1
846  aapq = cwork(p)
847  cwork(p) = cwork(q)
848  cwork(q) = aapq
849  END IF
850 *
851  IF( ir1.EQ.0 ) THEN
852 *
853 * Column norms are periodically updated by explicit
854 * norm computation.
855 *[!] Caveat:
856 * Unfortunately, some BLAS implementations compute DZNRM2(M,A(1,p),1)
857 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
858 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
859 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
860 * Hence, DZNRM2 cannot be trusted, not even in the case when
861 * the true norm is far from the under(over)flow boundaries.
862 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
863 * below should be replaced with "AAPP = DZNRM2( M, A(1,p), 1 )".
864 *
865  IF( ( sva( p ).LT.rootbig ) .AND.
866  $ ( sva( p ).GT.rootsfmin ) ) THEN
867  sva( p ) = dznrm2( m, a( 1, p ), 1 )
868  ELSE
869  temp1 = zero
870  aapp = one
871  CALL zlassq( m, a( 1, p ), 1, temp1, aapp )
872  sva( p ) = temp1*sqrt( aapp )
873  END IF
874  aapp = sva( p )
875  ELSE
876  aapp = sva( p )
877  END IF
878 *
879  IF( aapp.GT.zero ) THEN
880 *
881  pskipped = 0
882 *
883  DO 2002 q = p + 1, min( igl+kbl-1, n )
884 *
885  aaqq = sva( q )
886 *
887  IF( aaqq.GT.zero ) THEN
888 *
889  aapp0 = aapp
890  IF( aaqq.GE.one ) THEN
891  rotok = ( small*aapp ).LE.aaqq
892  IF( aapp.LT.( big / aaqq ) ) THEN
893  aapq = ( zdotc( m, a( 1, p ), 1,
894  $ a( 1, q ), 1 ) / aaqq ) / aapp
895  ELSE
896  CALL zcopy( m, a( 1, p ), 1,
897  $ cwork(n+1), 1 )
898  CALL zlascl( 'G', 0, 0, aapp, one,
899  $ m, 1, cwork(n+1), lda, ierr )
900  aapq = zdotc( m, cwork(n+1), 1,
901  $ a( 1, q ), 1 ) / aaqq
902  END IF
903  ELSE
904  rotok = aapp.LE.( aaqq / small )
905  IF( aapp.GT.( small / aaqq ) ) THEN
906  aapq = ( zdotc( m, a( 1, p ), 1,
907  $ a( 1, q ), 1 ) / aapp ) / aaqq
908  ELSE
909  CALL zcopy( m, a( 1, q ), 1,
910  $ cwork(n+1), 1 )
911  CALL zlascl( 'G', 0, 0, aaqq,
912  $ one, m, 1,
913  $ cwork(n+1), lda, ierr )
914  aapq = zdotc( m, a(1, p ), 1,
915  $ cwork(n+1), 1 ) / aapp
916  END IF
917  END IF
918 *
919 
920 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
921  aapq1 = -abs(aapq)
922  mxaapq = max( mxaapq, -aapq1 )
923 *
924 * TO rotate or NOT to rotate, THAT is the question ...
925 *
926  IF( abs( aapq1 ).GT.tol ) THEN
927  ompq = aapq / abs(aapq)
928 *
929 * .. rotate
930 *[RTD] ROTATED = ROTATED + ONE
931 *
932  IF( ir1.EQ.0 ) THEN
933  notrot = 0
934  pskipped = 0
935  iswrot = iswrot + 1
936  END IF
937 *
938  IF( rotok ) THEN
939 *
940  aqoap = aaqq / aapp
941  apoaq = aapp / aaqq
942  theta = -half*abs( aqoap-apoaq )/aapq1
943 *
944  IF( abs( theta ).GT.bigtheta ) THEN
945 *
946  t = half / theta
947  cs = one
948 
949  CALL zrot( m, a(1,p), 1, a(1,q), 1,
950  $ cs, conjg(ompq)*t )
951  IF ( rsvec ) THEN
952  CALL zrot( mvl, v(1,p), 1,
953  $ v(1,q), 1, cs, conjg(ompq)*t )
954  END IF
955 
956  sva( q ) = aaqq*sqrt( max( zero,
957  $ one+t*apoaq*aapq1 ) )
958  aapp = aapp*sqrt( max( zero,
959  $ one-t*aqoap*aapq1 ) )
960  mxsinj = max( mxsinj, abs( t ) )
961 *
962  ELSE
963 *
964 * .. choose correct signum for THETA and rotate
965 *
966  thsign = -sign( one, aapq1 )
967  t = one / ( theta+thsign*
968  $ sqrt( one+theta*theta ) )
969  cs = sqrt( one / ( one+t*t ) )
970  sn = t*cs
971 *
972  mxsinj = max( mxsinj, abs( sn ) )
973  sva( q ) = aaqq*sqrt( max( zero,
974  $ one+t*apoaq*aapq1 ) )
975  aapp = aapp*sqrt( max( zero,
976  $ one-t*aqoap*aapq1 ) )
977 *
978  CALL zrot( m, a(1,p), 1, a(1,q), 1,
979  $ cs, conjg(ompq)*sn )
980  IF ( rsvec ) THEN
981  CALL zrot( mvl, v(1,p), 1,
982  $ v(1,q), 1, cs, conjg(ompq)*sn )
983  END IF
984  END IF
985  cwork(p) = -cwork(q) * ompq
986 *
987  ELSE
988 * .. have to use modified Gram-Schmidt like transformation
989  CALL zcopy( m, a( 1, p ), 1,
990  $ cwork(n+1), 1 )
991  CALL zlascl( 'G', 0, 0, aapp, one, m,
992  $ 1, cwork(n+1), lda,
993  $ ierr )
994  CALL zlascl( 'G', 0, 0, aaqq, one, m,
995  $ 1, a( 1, q ), lda, ierr )
996  CALL zaxpy( m, -aapq, cwork(n+1), 1,
997  $ a( 1, q ), 1 )
998  CALL zlascl( 'G', 0, 0, one, aaqq, m,
999  $ 1, a( 1, q ), lda, ierr )
1000  sva( q ) = aaqq*sqrt( max( zero,
1001  $ one-aapq1*aapq1 ) )
1002  mxsinj = max( mxsinj, sfmin )
1003  END IF
1004 * END IF ROTOK THEN ... ELSE
1005 *
1006 * In the case of cancellation in updating SVA(q), SVA(p)
1007 * recompute SVA(q), SVA(p).
1008 *
1009  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1010  $ THEN
1011  IF( ( aaqq.LT.rootbig ) .AND.
1012  $ ( aaqq.GT.rootsfmin ) ) THEN
1013  sva( q ) = dznrm2( m, a( 1, q ), 1 )
1014  ELSE
1015  t = zero
1016  aaqq = one
1017  CALL zlassq( m, a( 1, q ), 1, t,
1018  $ aaqq )
1019  sva( q ) = t*sqrt( aaqq )
1020  END IF
1021  END IF
1022  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1023  IF( ( aapp.LT.rootbig ) .AND.
1024  $ ( aapp.GT.rootsfmin ) ) THEN
1025  aapp = dznrm2( m, a( 1, p ), 1 )
1026  ELSE
1027  t = zero
1028  aapp = one
1029  CALL zlassq( m, a( 1, p ), 1, t,
1030  $ aapp )
1031  aapp = t*sqrt( aapp )
1032  END IF
1033  sva( p ) = aapp
1034  END IF
1035 *
1036  ELSE
1037 * A(:,p) and A(:,q) already numerically orthogonal
1038  IF( ir1.EQ.0 )notrot = notrot + 1
1039 *[RTD] SKIPPED = SKIPPED + 1
1040  pskipped = pskipped + 1
1041  END IF
1042  ELSE
1043 * A(:,q) is zero column
1044  IF( ir1.EQ.0 )notrot = notrot + 1
1045  pskipped = pskipped + 1
1046  END IF
1047 *
1048  IF( ( i.LE.swband ) .AND.
1049  $ ( pskipped.GT.rowskip ) ) THEN
1050  IF( ir1.EQ.0 )aapp = -aapp
1051  notrot = 0
1052  GO TO 2103
1053  END IF
1054 *
1055  2002 CONTINUE
1056 * END q-LOOP
1057 *
1058  2103 CONTINUE
1059 * bailed out of q-loop
1060 *
1061  sva( p ) = aapp
1062 *
1063  ELSE
1064  sva( p ) = aapp
1065  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1066  $ notrot = notrot + min( igl+kbl-1, n ) - p
1067  END IF
1068 *
1069  2001 CONTINUE
1070 * end of the p-loop
1071 * end of doing the block ( ibr, ibr )
1072  1002 CONTINUE
1073 * end of ir1-loop
1074 *
1075 * ... go to the off diagonal blocks
1076 *
1077  igl = ( ibr-1 )*kbl + 1
1078 *
1079  DO 2010 jbc = ibr + 1, nbl
1080 *
1081  jgl = ( jbc-1 )*kbl + 1
1082 *
1083 * doing the block at ( ibr, jbc )
1084 *
1085  ijblsk = 0
1086  DO 2100 p = igl, min( igl+kbl-1, n )
1087 *
1088  aapp = sva( p )
1089  IF( aapp.GT.zero ) THEN
1090 *
1091  pskipped = 0
1092 *
1093  DO 2200 q = jgl, min( jgl+kbl-1, n )
1094 *
1095  aaqq = sva( q )
1096  IF( aaqq.GT.zero ) THEN
1097  aapp0 = aapp
1098 *
1099 * .. M x 2 Jacobi SVD ..
1100 *
1101 * Safe Gram matrix computation
1102 *
1103  IF( aaqq.GE.one ) THEN
1104  IF( aapp.GE.aaqq ) THEN
1105  rotok = ( small*aapp ).LE.aaqq
1106  ELSE
1107  rotok = ( small*aaqq ).LE.aapp
1108  END IF
1109  IF( aapp.LT.( big / aaqq ) ) THEN
1110  aapq = ( zdotc( m, a( 1, p ), 1,
1111  $ a( 1, q ), 1 ) / aaqq ) / aapp
1112  ELSE
1113  CALL zcopy( m, a( 1, p ), 1,
1114  $ cwork(n+1), 1 )
1115  CALL zlascl( 'G', 0, 0, aapp,
1116  $ one, m, 1,
1117  $ cwork(n+1), lda, ierr )
1118  aapq = zdotc( m, cwork(n+1), 1,
1119  $ a( 1, q ), 1 ) / aaqq
1120  END IF
1121  ELSE
1122  IF( aapp.GE.aaqq ) THEN
1123  rotok = aapp.LE.( aaqq / small )
1124  ELSE
1125  rotok = aaqq.LE.( aapp / small )
1126  END IF
1127  IF( aapp.GT.( small / aaqq ) ) THEN
1128  aapq = ( zdotc( m, a( 1, p ), 1,
1129  $ a( 1, q ), 1 ) / max(aaqq,aapp) )
1130  $ / min(aaqq,aapp)
1131  ELSE
1132  CALL zcopy( m, a( 1, q ), 1,
1133  $ cwork(n+1), 1 )
1134  CALL zlascl( 'G', 0, 0, aaqq,
1135  $ one, m, 1,
1136  $ cwork(n+1), lda, ierr )
1137  aapq = zdotc( m, a( 1, p ), 1,
1138  $ cwork(n+1), 1 ) / aapp
1139  END IF
1140  END IF
1141 *
1142 
1143 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1144  aapq1 = -abs(aapq)
1145  mxaapq = max( mxaapq, -aapq1 )
1146 *
1147 * TO rotate or NOT to rotate, THAT is the question ...
1148 *
1149  IF( abs( aapq1 ).GT.tol ) THEN
1150  ompq = aapq / abs(aapq)
1151  notrot = 0
1152 *[RTD] ROTATED = ROTATED + 1
1153  pskipped = 0
1154  iswrot = iswrot + 1
1155 *
1156  IF( rotok ) THEN
1157 *
1158  aqoap = aaqq / aapp
1159  apoaq = aapp / aaqq
1160  theta = -half*abs( aqoap-apoaq )/ aapq1
1161  IF( aaqq.GT.aapp0 )theta = -theta
1162 *
1163  IF( abs( theta ).GT.bigtheta ) THEN
1164  t = half / theta
1165  cs = one
1166  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1167  $ cs, conjg(ompq)*t )
1168  IF( rsvec ) THEN
1169  CALL zrot( mvl, v(1,p), 1,
1170  $ v(1,q), 1, cs, conjg(ompq)*t )
1171  END IF
1172  sva( q ) = aaqq*sqrt( max( zero,
1173  $ one+t*apoaq*aapq1 ) )
1174  aapp = aapp*sqrt( max( zero,
1175  $ one-t*aqoap*aapq1 ) )
1176  mxsinj = max( mxsinj, abs( t ) )
1177  ELSE
1178 *
1179 * .. choose correct signum for THETA and rotate
1180 *
1181  thsign = -sign( one, aapq1 )
1182  IF( aaqq.GT.aapp0 )thsign = -thsign
1183  t = one / ( theta+thsign*
1184  $ sqrt( one+theta*theta ) )
1185  cs = sqrt( one / ( one+t*t ) )
1186  sn = t*cs
1187  mxsinj = max( mxsinj, abs( sn ) )
1188  sva( q ) = aaqq*sqrt( max( zero,
1189  $ one+t*apoaq*aapq1 ) )
1190  aapp = aapp*sqrt( max( zero,
1191  $ one-t*aqoap*aapq1 ) )
1192 *
1193  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1194  $ cs, conjg(ompq)*sn )
1195  IF( rsvec ) THEN
1196  CALL zrot( mvl, v(1,p), 1,
1197  $ v(1,q), 1, cs, conjg(ompq)*sn )
1198  END IF
1199  END IF
1200  cwork(p) = -cwork(q) * ompq
1201 *
1202  ELSE
1203 * .. have to use modified Gram-Schmidt like transformation
1204  IF( aapp.GT.aaqq ) THEN
1205  CALL zcopy( m, a( 1, p ), 1,
1206  $ cwork(n+1), 1 )
1207  CALL zlascl( 'G', 0, 0, aapp, one,
1208  $ m, 1, cwork(n+1),lda,
1209  $ ierr )
1210  CALL zlascl( 'G', 0, 0, aaqq, one,
1211  $ m, 1, a( 1, q ), lda,
1212  $ ierr )
1213  CALL zaxpy( m, -aapq, cwork(n+1),
1214  $ 1, a( 1, q ), 1 )
1215  CALL zlascl( 'G', 0, 0, one, aaqq,
1216  $ m, 1, a( 1, q ), lda,
1217  $ ierr )
1218  sva( q ) = aaqq*sqrt( max( zero,
1219  $ one-aapq1*aapq1 ) )
1220  mxsinj = max( mxsinj, sfmin )
1221  ELSE
1222  CALL zcopy( m, a( 1, q ), 1,
1223  $ cwork(n+1), 1 )
1224  CALL zlascl( 'G', 0, 0, aaqq, one,
1225  $ m, 1, cwork(n+1),lda,
1226  $ ierr )
1227  CALL zlascl( 'G', 0, 0, aapp, one,
1228  $ m, 1, a( 1, p ), lda,
1229  $ ierr )
1230  CALL zaxpy( m, -conjg(aapq),
1231  $ cwork(n+1), 1, a( 1, p ), 1 )
1232  CALL zlascl( 'G', 0, 0, one, aapp,
1233  $ m, 1, a( 1, p ), lda,
1234  $ ierr )
1235  sva( p ) = aapp*sqrt( max( zero,
1236  $ one-aapq1*aapq1 ) )
1237  mxsinj = max( mxsinj, sfmin )
1238  END IF
1239  END IF
1240 * END IF ROTOK THEN ... ELSE
1241 *
1242 * In the case of cancellation in updating SVA(q), SVA(p)
1243 * .. recompute SVA(q), SVA(p)
1244  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1245  $ THEN
1246  IF( ( aaqq.LT.rootbig ) .AND.
1247  $ ( aaqq.GT.rootsfmin ) ) THEN
1248  sva( q ) = dznrm2( m, a( 1, q ), 1)
1249  ELSE
1250  t = zero
1251  aaqq = one
1252  CALL zlassq( m, a( 1, q ), 1, t,
1253  $ aaqq )
1254  sva( q ) = t*sqrt( aaqq )
1255  END IF
1256  END IF
1257  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1258  IF( ( aapp.LT.rootbig ) .AND.
1259  $ ( aapp.GT.rootsfmin ) ) THEN
1260  aapp = dznrm2( m, a( 1, p ), 1 )
1261  ELSE
1262  t = zero
1263  aapp = one
1264  CALL zlassq( m, a( 1, p ), 1, t,
1265  $ aapp )
1266  aapp = t*sqrt( aapp )
1267  END IF
1268  sva( p ) = aapp
1269  END IF
1270 * end of OK rotation
1271  ELSE
1272  notrot = notrot + 1
1273 *[RTD] SKIPPED = SKIPPED + 1
1274  pskipped = pskipped + 1
1275  ijblsk = ijblsk + 1
1276  END IF
1277  ELSE
1278  notrot = notrot + 1
1279  pskipped = pskipped + 1
1280  ijblsk = ijblsk + 1
1281  END IF
1282 *
1283  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1284  $ THEN
1285  sva( p ) = aapp
1286  notrot = 0
1287  GO TO 2011
1288  END IF
1289  IF( ( i.LE.swband ) .AND.
1290  $ ( pskipped.GT.rowskip ) ) THEN
1291  aapp = -aapp
1292  notrot = 0
1293  GO TO 2203
1294  END IF
1295 *
1296  2200 CONTINUE
1297 * end of the q-loop
1298  2203 CONTINUE
1299 *
1300  sva( p ) = aapp
1301 *
1302  ELSE
1303 *
1304  IF( aapp.EQ.zero )notrot = notrot +
1305  $ min( jgl+kbl-1, n ) - jgl + 1
1306  IF( aapp.LT.zero )notrot = 0
1307 *
1308  END IF
1309 *
1310  2100 CONTINUE
1311 * end of the p-loop
1312  2010 CONTINUE
1313 * end of the jbc-loop
1314  2011 CONTINUE
1315 *2011 bailed out of the jbc-loop
1316  DO 2012 p = igl, min( igl+kbl-1, n )
1317  sva( p ) = abs( sva( p ) )
1318  2012 CONTINUE
1319 ***
1320  2000 CONTINUE
1321 *2000 :: end of the ibr-loop
1322 *
1323 * .. update SVA(N)
1324  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1325  $ THEN
1326  sva( n ) = dznrm2( m, a( 1, n ), 1 )
1327  ELSE
1328  t = zero
1329  aapp = one
1330  CALL zlassq( m, a( 1, n ), 1, t, aapp )
1331  sva( n ) = t*sqrt( aapp )
1332  END IF
1333 *
1334 * Additional steering devices
1335 *
1336  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1337  $ ( iswrot.LE.n ) ) )swband = i
1338 *
1339  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( dble( n ) )*
1340  $ tol ) .AND. ( dble( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1341  GO TO 1994
1342  END IF
1343 *
1344  IF( notrot.GE.emptsw )GO TO 1994
1345 *
1346  1993 CONTINUE
1347 * end i=1:NSWEEP loop
1348 *
1349 * #:( Reaching this point means that the procedure has not converged.
1350  info = nsweep - 1
1351  GO TO 1995
1352 *
1353  1994 CONTINUE
1354 * #:) Reaching this point means numerical convergence after the i-th
1355 * sweep.
1356 *
1357  info = 0
1358 * #:) INFO = 0 confirms successful iterations.
1359  1995 CONTINUE
1360 *
1361 * Sort the singular values and find how many are above
1362 * the underflow threshold.
1363 *
1364  n2 = 0
1365  n4 = 0
1366  DO 5991 p = 1, n - 1
1367  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
1368  IF( p.NE.q ) THEN
1369  temp1 = sva( p )
1370  sva( p ) = sva( q )
1371  sva( q ) = temp1
1372  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1373  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1374  END IF
1375  IF( sva( p ).NE.zero ) THEN
1376  n4 = n4 + 1
1377  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1378  END IF
1379  5991 CONTINUE
1380  IF( sva( n ).NE.zero ) THEN
1381  n4 = n4 + 1
1382  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1383  END IF
1384 *
1385 * Normalize the left singular vectors.
1386 *
1387  IF( lsvec .OR. uctol ) THEN
1388  DO 1998 p = 1, n4
1389 * CALL ZDSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1390  CALL zlascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1391  1998 CONTINUE
1392  END IF
1393 *
1394 * Scale the product of Jacobi rotations.
1395 *
1396  IF( rsvec ) THEN
1397  DO 2399 p = 1, n
1398  temp1 = one / dznrm2( mvl, v( 1, p ), 1 )
1399  CALL zdscal( mvl, temp1, v( 1, p ), 1 )
1400  2399 CONTINUE
1401  END IF
1402 *
1403 * Undo scaling, if necessary (and possible).
1404  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1405  $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1406  $ ( sfmin / skl ) ) ) ) THEN
1407  DO 2400 p = 1, n
1408  sva( p ) = skl*sva( p )
1409  2400 CONTINUE
1410  skl = one
1411  END IF
1412 *
1413  rwork( 1 ) = skl
1414 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1415 * then some of the singular values may overflow or underflow and
1416 * the spectrum is given in this factored representation.
1417 *
1418  rwork( 2 ) = dble( n4 )
1419 * N4 is the number of computed nonzero singular values of A.
1420 *
1421  rwork( 3 ) = dble( n2 )
1422 * N2 is the number of singular values of A greater than SFMIN.
1423 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1424 * that may carry some information.
1425 *
1426  rwork( 4 ) = dble( i )
1427 * i is the index of the last sweep before declaring convergence.
1428 *
1429  rwork( 5 ) = mxaapq
1430 * MXAAPQ is the largest absolute value of scaled pivots in the
1431 * last sweep
1432 *
1433  rwork( 6 ) = mxsinj
1434 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1435 * in the last sweep
1436 *
1437  RETURN
1438 * ..
1439 * .. END OF ZGESVJ
1440 * ..
double precision function dlamch(CMACH)
DLAMCH
Definition: dlamch.f:65
integer function idamax(N, DX, INCX)
IDAMAX
Definition: idamax.f:53
subroutine zcopy(N, ZX, INCX, ZY, INCY)
ZCOPY
Definition: zcopy.f:52
subroutine dlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
DLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: dlascl.f:145
subroutine zswap(N, ZX, INCX, ZY, INCY)
ZSWAP
Definition: zswap.f:52
subroutine zlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
ZLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: zlascl.f:145
subroutine zrot(N, CX, INCX, CY, INCY, C, S)
ZROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors...
Definition: zrot.f:105
subroutine zgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ0 pre-processor for the routine zgesvj.
Definition: zgsvj0.f:220
complex *16 function zdotc(N, ZX, INCX, ZY, INCY)
ZDOTC
Definition: zdotc.f:54
double precision function dznrm2(N, X, INCX)
DZNRM2
Definition: dznrm2.f:56
subroutine zlassq(N, X, INCX, SCALE, SUMSQ)
ZLASSQ updates a sum of squares represented in scaled form.
Definition: zlassq.f:108
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
subroutine zlaset(UPLO, M, N, ALPHA, BETA, A, LDA)
ZLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: zlaset.f:108
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
subroutine zdscal(N, DA, ZX, INCX)
ZDSCAL
Definition: zdscal.f:54
subroutine zgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ1 pre-processor for the routine zgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: zgsvj1.f:238
subroutine zaxpy(N, ZA, ZX, INCX, ZY, INCY)
ZAXPY
Definition: zaxpy.f:53

Here is the call graph for this function:

Here is the caller graph for this function: