![]()
|
VECFEM3 Reference Manual: LINSOL
VECFEM3 Reference Manual: LINSOLType: formatIntroductionThe linear equation solver LINSOL is designed to solve a l -dimensional linear system MAT*x = b with a very large and sparse l x l real matrix MAT . In this manual page we explain the structure of the storage scheme wich is used by LINSOL. In the following we use the term A instead of MAT .Parallel ComputerLINSOL is designed to run optimally on parallel distributed-memory computers. Before the computation can start, the GLOBAL matrix has to be distributed to the local memories of the processors. This is done in the following way:Each process(or) p gets lmatbk(p) consecutive rows of the GLOBAL matrix A starting at row ptrmbk(p) +1. Thus we have a LOCAL lmatbk(p) x l matrix A(p) on every process(or) p . This LOCAL matrix is stored in the sparse matrix storage scheme described in the next chapter. Additionaly, LINSOL assumes a logical column block splitting of the matrix on each process(or). This is explained in the chapter Column Block Splitting . The Storage SchemeThe real LOCAL matrix A(p) on processor p is of size lmatbk(p) x l . It is represented as sum of LOCAL 10 sub-matrices A(i,p) :A(p) = A(1,p) + A(2,p) + ... + A(i,p) + ... + A(10,p) Each submatrix A(i,p) contains one (and only one) special data structure. The vectors within the allowed data structures are named "vector terms". The number of vector terms for A(i,p) is stored in the (virtual) array nvt(i,p) . The different submatrices (and vector terms respectively) can be assembled without cutback to the matrix A(p) . WARNING:Two elements of two different vector terms can have the same entry in the matrix A(p) - this means that the entry is the sum of the two elements. The overlay of elements in the matrix A(p) does not influence the correct computation of the solution of the linear system; but then the normalization does not work in the specified way - this means that you don't get exactly the chosen normalization (except of the normalization by main diagonal). In the following, the index p indicating the process(or) number will be dropped. The integer number typ indicates the type of the LOCAL submatrix. NOTES:The variables iac , iar and the array index are GLOBAL and relate to the GLOBAL matrix A . The variables l and lvt are local and relate to the LOCAL matrix A . The variables adda , indc and indr are pointers and point to the matrix array mat and to the array of indices index respectively. The following examples show the settings of the varables for the first block of the LOCAL matrix on the first process(or)- this means that for the other blocks and (or) on the other processes (processors) the values of the GLOBAL variables ( iac , iar , index ) change.
On each process, all index vectors are stored in the integer array
index
of length
lindex
.
Indc
+1 and
indr
+1 point to the first
element of index vectors relating to vector terms within the array
index
.
The entries of the LOCAL matrix
A
are stored in the real array
mat
of length
lmat
.
Adda
+1 points to the first element
of a vector term within the array
mat
.
The integer numbers
typ, adda, lvt, iac, iar, indc
and
indr
are
handed over in the integer matrix
info(ia1,ia2)
;
ia1>=nvt
and
ia2
>=7 must hold. The values
info(i,1)
,...,
info(i,7)
specify the information for the i-th vector term of
typ=info(i,1)
.
The following table shows the meaning of
info(i,k)
for the different data structures:
Column Block SplittingLINSOL assumes that the matrix is not only partitioned in blocks of rows among the processors (processes), but also that each matrix MAT(p) is partitioned logically in nproc (== number of processors (processes)) column blocks. It must hold: if a vector term of the matrix MAT(p) has its first element in column block i (i=1,..,nproc) - this means that it is between column ptrmbk(i) +1 and ptrmbk(i+1) - ,then ALL elements of the vector term have to be in the column block i.The Symmetrical CaseA symmetrical matrix is stored in a special symmetrical storage scheme which uses the symmetry of the matrix. Matrix A can be represented asA = AA + D + AA(T) where D is the main diagonal of A and AA is a M X M -matrix (not necessarily equal to the upper or lower triangle of A ). AA(T) denotes the transposed matrix of AA . In this case only the matrix AA and the main diagonal (if nonzero) must be handed over.How to Select the Sub-matricesBut there is the problem how to separate the matrix A into the local sub-matrices A(i) with i element of set {1,2,..,11} because of different partitioning possibilities. The optimal way depends strongly on the used computer. We recommend the following strategy to find the partitioning: first look to all 'long' full row or column portions in your matrix; second look to all 'long' packed row or column portions. Then look to all occupied diagonals of A (=diagonal with at least one nonzero element). The long diagonals with many nonzero elements are stored in the scheme FULLDIAG . If there are enough nonzero elements in the diagonal i.e. the packed diagonal is 'long', you should store the diagonal in scheme PDIAG . What 'long' means, depends on your computer. Normally 'long' is the vector length where a computer nearly reaches its peak performance for the special operation (e.g. general triad, linked triad, scalar product, ..). For the remaining elements of the matrix you can select the SKY scheme, whereas the length of one vector term may not exceed l.ExamplesHere follow two examples for the storage scheme in the nonsymmetrical and the symmetrical case.Example (nonsymmetric problem on a monoprocessor)First we look to a nonsymmetric problem. MAT is a 8 X 8 matrix (=>l=8): | 4. 0 2. 0 0 0 5. 0. | | 0 4. -1. 3. 0 0 0 0. | | 0 0 4. -1. 0 -2. 0 0. | MAT = | 3. 0 0 4. -1. 0 0 0. | | 0 0 0 0 4. -1. 3. 0. | | 0 0 0 0 0 4. 0 0. | | 0 0 0 0 0 0 4. 0. | | 0 0 1. 1. 1. 1. 1. 4. | The matrix MAT is subdivided into five vector terms MAT = MAIN + 1 x FULLDIAG + 1 x IXCOL + 1 x SKY + 1 x FULLROW = A(1) + A(2) + A(4) + A(6) + A(7) .We set ia1=nvt=5 and ia2=7. We get
[------A(1)-----------][----A(2)------][----A(4)---][-A(6)][-----A(7)-----] MAT = (4.,4.,4.,4.,4.,4.,4.,4.,-1.,-1.,-1.,-1.,5.,3.,-2.,3.,2.,3.,1.,1.,1.,1.,1.) ^ ^ ^ ^ ^ adda(1)=0 adda(2)=8 adda(3)=12 adda(4)=16 adda(5)=18The index vectors are stored in the array index (lindex=8): [--A(4)-][--A(6)--] index = (7,4,6,1,3,7, 1,5) ^ ^ ^ info(3,6)=0 info(4,4)=4 info(4,6)=6 Example (symmetric problem on a monoprocessor)Now we look to a symmetric problem. MAT is a 8 X 8 matrix (=>l=8):| 4. 0 2. 3. 0 0 5. 0. | | 0 4. -1. 3. 0 0 0 0. | | 2 -1. 4. -1. 0 -2. 0 1. | MAT = | 3. 3. -1. 4. -1. 0 0 1. | | 0 0 0 -1. 4. -1. 3. 1. | | 0 0 -2. 0 -1. 4. 0 1. | | 5. 0 0 0 3. 0 4. 1. | | 0 0 1. 1. 1. 1. 1. 4. |MAT = A + D + A(T), where D is the main diagonal and A(T) is the transposed matrix of A. To store this symmetrical matrix, we can use the storage scheme of example 4.1. All we have to do is to set lsym =.true. ReferencesR. Weiss, H. Haefner, W. Schoenauer: LINSOL (LINear SOLver) - Description and User's Guide for the Parallelized Version (DRAFT Version 0.96); University of Karlsruhe; Computing Center; Internal Report 61/95; 1995.AuthorsProgram by L. Grosz, H. Haefner, C. Roll, P. Sternecker, R. Weiss 1989-96. Please contact L.Grosz. By L. Grosz, Auckland, 4 June 2000. |