- Introduction

        bls1: sparse svd via hybrid block Lanczos procedure for
              equivalent 2-cyclic eigensystems.

	bls1.c is an ANSI-C code designed to compute singular values
	and singular vectors of a large sparse matrix A.  This is a     
	modified version of the block Lanczos algorithm first
        published by Golub, Luk, and Overton (ACM TOMS 7(2):149-169, 1981).
        This particular implementation is discussed in "Multiprocessor
        Sparse SVD Algorithms and Applications", Ph.D. Thesis by M. Berry,
        University of Illinois at Urbana-Champaign, October 1990.

        This "hybrid" block Lanczos procedure consists of five phases:

        phase 1: Block Lanczos outer iteration to yield a block upper 
                 bidiagonal matrix S which shares the same singular
                 values of the original sparse matrix A.  Total or
                 complete re-orthogonalization is used here.

        phase 2: Lanczos method for bi-diagonalizing the S matrix from
                 phase 1 to yield the bi-diagonal matrix B which preserves
                 the same singular values.  Complete or total re-orthogonal-
                 ization is used for this Lanczos recursion.  A point
                 Lanczos method is used if a blocksize (nb) of 1 is encountered.
                     
        phase 3: Apply an appropriate QR iteration to diagonalize B and 
                 hence produce approximate singular values (array alpha)
                 of the original matrix A.

        phase 4: Convergence test using a user-supplied residual tolerance
                 (tol).

        phase 5: Iteration restart with orthogonal projection with respect
                 to any (all) converged singular vectors.
	
- Calling sequence

	The calling sequence for procedure blklan1 is

        long   blklan1(FILE *fp, long nnzero, long m, long n, long ik, 
               double *v, double *u, double *sing, long ic, long ib, 
               double tol, double *res, long maxit, long *iko, long *ico,
               long *ibo, long *memory)

	The user specifies as part of the parameter list:
	 
        fp              ... a pointer to output file {FILE *}.
	nnzero          ... number of nonzeros in matrix A {long}.
        m               ... row dimension of the sparse matrix A whose SVD
                            is sought {long}.
	n		... column dimension of the sparse matrix a whose SVD
                            is sought {long}.
	ik     	        ... number of singular triplets desired {long}.
	ib     	        ... initial block size for outer iteration {long}.
	ic     	        ... upper bound for dimension of Krylov subspace 
                            generated via outer iteration {long}.  ic is the
                            maximum dimension for the block upper bidiagonal
                            matrix S generated in phase 1 above.
	tol        	... user-specified tolerance for approximate singular
                            triplets {double}.
	maxit           ... maximum number of outer iterations allowed
                            {long}.
	 
	blklan1 returns via its parameter list the following items:
	 
        ik0             ... number of singular triplets approximated {long}.
	ic0    	        ... last bound for dimension of Krylov subspace 
                            used within outer iteration {long}.
	ib0    	        ... final block size used in outer iteration {long}.
	sing            ... one-dimensional array containing the ik0
                            approximate singular values {double}.
	v	        ... two-dimensional array containing the ik0
                            approximate right singular vectors corresponding
                            to the approximate singular values in array
                            sing {double}.
	u	        ... two-dimensional array containing the ik0
                            approximate left singular vectors corresponding
                            to the approximate singular values in array
                            sing {double}.
	res             ... one-dimensional array containing the ik0
                            residuals of the approximate singular triplets
                            {double}.
	memory          ... memory needed in bytes {long}.

- User-supplied routines

        For bls1.c, the user must specify multiplication by matrices
        A and A' separately (in subroutines opa and opat, respectively).

        The specification of opa should look something like

           void opa(long m, long n, double *x, double *y)

        so that opa takes an n by 1 vector x and returns the m by 1
        vector y = A*x, where A is m by n (m >> n).

        The specification of opat should look something like

           void opat(long n, double *x, double *y)

        so that opat takes an m by 1 vector x and returns the n by 1
        vector y = A'*x, where A is m by n (m >> n).

        This version of bls1.c is designed to approximate the ik-largest
        singular triplets of A.  Users interested in the ik-smallest
        singular triplets need only sort the alpha array in increasing
        (as opposed to the default ascending order) following the line

                   qriter2(nn, alpha, beta, qqp, ppp);

        in phase 3 of bls1.c.  The columns of the two-dimensional arrays
        ppp and qqp must be reordered to reflect a one-to-one correspondence
        with the newly sorted elements of alpha (which are approximate 
        singular values of the matrix A).

- Information 

        Please address all questions, comments, or corrections to:

        M. W. Berry
        Department of Computer Science
        University of Tennessee
        107 Ayres Hall
        Knoxville, TN  37996-1301
        email: berry@cs.utk.edu
        phone: (615) 974-5067

-File descriptions

        bls1.c requires the include file bls1.h for compilation.
        The local parameters defined in bls1.h are:

         k        remaining # of desired triplets (done when = 0)
         k0       count of triplets found in current iteration
         nb       current block size
         nc       size of current subspace
         ns       number of blocks in current iteration

        The input and output files associated with bls1.c are
        listed below.

             Code           Input         Output
            ------      ------------    ---------
            bls1.c      blp1, matrix    blo1,blv1


       The binary output file blv1 (which contains the 
       approximate right singular vectors followed by the ap-
       proximate left singular vectors) will be created by
       bls1.c if it does not already exist.  If you are
       running on a Unix-based workstation you should uncomment
       the line

                 /*   #define  UNIX_CREAT */

       in the declarations prior to main() in bls1.c.
       UNIX_CREAT specifies the use of the UNIX "creat" system 
       routine with the permissions defined by the PERMS constant

                  #define PERMS 0664

       You may adjust PERMS for the desired permissions on the
       blv1 file (default is Read/Write for user and group,
       and Read for others).  Subsequent runs will be able to
       open and overwrite these files with the default permissions.

       bls1.c obtains its parameters specifying the
       sparse SVD problem to be solved from the input file
       blp1. This parameter file contains the single line

	 <name>   maxit   nc   nb   nums   tol   vtf

       where 

        <name>     is the name of the data set containing the nonzeros of A.
        maxit      is an integer specifying maximum number of (outer)
                   block Lanczos iterations allowed.
        nc         is an integer specifying the upper bound for the
                   Krylov subspace generated via the outer iteration.
        nb         is an integer specifying the initial block size for 
                   the outer iteration.
        nums       is an integer specifying the number of singular triplets
                   desired.
        tol        is a double specifying the residual tolerance for
                   approximated singular triplets.
        vtf        contains the string TRUE or FALSE to indicate when 
                   singular triplets are needed (TRUE) and when only 
                   singular values are needed (FALSE);  If vtf is TRUE,
                   the unformatted output file blv1 will contain the 
                   approximate singular vectors written in the order 

                      u[1], v[1], u[2], v[2], ..., u[ik0], v[ik0]. 

                   Here u[i] and v[i] denote the left and right
                   singular vectors, respectively, corresponding to the 
                   i-th approximate singular value, sing[i].

- Sparse matrix format

        bls1.c is designed to read input matrices that are stored
        in the Harwell-Boeing sparse matrix format.  The nonzeros
        of such matrices are stored in a compressed column-oriented
        format.  The row indices and corresponding nonzero values
        are stored by columns with a column start index array
        whose entries contain pointers to the nonzero starting each
        column.  bls1.c reads the sparse matrix data from the input
        file called "matrix".

        Each input file "matrix" should begin with a four-line header
        record followed by three more records containing, in order, 
        the column-start pointers, the row indices, and the nonzero
        numerical values.

        The first line of the header consists of a 72-character title
        and an 8-character key by which the matrices are referenced.
        The second line can be used for comments or to indicate record
        length for each index or value array.  Although this line is 
        generally ignored, A CHARACTER MUST BE PLACED ON THAT LINE.
        The third line contains a three-character string denoting the
        matrix type and the three integers specifying the number of rows,
        columns, and nonzeros.  The fourth line which usually contains
        input format for Fortran-77 I/O is ignored by our ANSI-C code.
        The exact format is

		"%72c %*s %*s %*s %d %d %d %*d"

	for the first three lines of the header,

		line 1      <title>         <key>
		 	(col.  1 - 72) (col. 73 - 80)

		line 2   <string>

		line 3   <matrix type> nrow ncol nnzero 

	and 

		"%*s %*s %*s %*s"

	for the last line of the header.

		line 4   <string1> <string2> <string3> <string4>

        Even though only the title and the integers specifying the
        number of rows, columns, and nonzero elements are read, other
        strings of input must be present in indicated positions.
        Otherwise, the format of the "fscanf" statements must be 
        changed accordingly.
