HLIBpro  2.9
Solving a Sparse Equation System

Given a sparse matrix \(A\) and a right hand side \(b\), the solution \(x\) of

\[Ax = b\]

is sought, whereby H-LU factorisation shall be used to improve the convergence of an iterative solver. No geometrical data is assumed and hence, algebraic clustering is performed for converting \(A\) into an H-matrix. Furthermore, nested dissection is applied to improve the efficiency of the H-LU factorisation.
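As a short worked form of this idea, and assuming left preconditioning (the text does not state which variant the solver applies): with \(W = (LU)^{-1} \approx A^{-1}\) obtained from the H-LU factorisation, the iterative solver effectively works on

\[WAx = Wb,\]

where \(WA \approx I\). The more accurate the factorisation, the closer \(WA\) is to the identity and the fewer iterations are needed.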

Loading the Sparse Matrix

The program starts with the inclusion of the needed header files and the initialisation of HLIBpro.

#include <cstdlib>
#include <iostream>
#include <memory>   // std::make_unique, used further below

#include "hlib.hh"

using namespace std;
using namespace HLIB;

int
main ( int argc, char ** argv )
{
    try
    {
        INIT();

The next step is the definition of the sparse matrix, which shall be read from a file. HLIBpro supports various foreign matrix formats, e.g. Matlab, SAMG, Harwell-Boeing or MatrixMarket. In this example, the matrix is stored in Matlab format:

auto M = read_matrix( "M.mat" );

The function read_matrix performs autodetection of the file format and is therefore suited for all supported formats.

Remarks
In the case of Matlab, read_matrix reads in the first matrix found in the file. If you have stored several matrices in one file, you have to use TMatlabMatrixIO and provide the name of the matrix to read.

The function read_matrix returns a unique_ptr, so the corresponding object is automatically deleted when the current block is left, thereby reducing the danger of memory leaks. This technique is used extensively in HLIBpro.
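The ownership behaviour described in this remark can be illustrated with a minimal standalone sketch. The names Matrix, make_matrix and live_inside_block are hypothetical stand-ins introduced only for this example; they are not part of the library.

```cpp
#include <memory>

// Hypothetical stand-in for a library object; counts live instances.
struct Matrix
{
    static int & live () { static int n = 0; return n; }

    Matrix  () { ++live(); }
    ~Matrix () { --live(); }
};

// Hypothetical factory that hands ownership to the caller via unique_ptr.
std::unique_ptr< Matrix > make_matrix ()
{
    return std::make_unique< Matrix >();
}

// Returns the number of live objects observed inside the inner block.
int live_inside_block ()
{
    int inside = 0;

    {
        auto M = make_matrix();     // M owns the object
        inside = Matrix::live();    // one object is alive while M is in scope
    }                               // M leaves scope: the object is deleted here

    return inside;
}
```

After live_inside_block() has returned, Matrix::live() is back at zero without any explicit delete.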

Any kind of matrix may be stored in the given file, e.g. a dense matrix. Since a sparse matrix is expected in the following, the type of the matrix is tested:

if ( ! IS_TYPE( M, TSparseMatrix ) )
{
    cout << "given matrix is not sparse (" << M->typestr() << ")" << endl;
    exit( 1 );
}

auto S = ptrcast( M.get(), TSparseMatrix );

The macro IS_TYPE returns true if the corresponding object is of the given type. This is an example of the runtime type information (RTTI) system in HLIBpro, which works similarly to dynamic casts in C++. The advantage of the HLIBpro RTTI is that further information can be gathered, e.g. the type name through the function typestr(), independent of the corresponding C++ compiler support.

Remarks
The HLIBpro RTTI is also faster than the C++ RTTI, since only some numbers are compared. Of course, the C++ approach is much more general and needs no special support by the programmer.

Finally, since we know by now that M contains a sparse matrix, we cast the pointer for easier use in the following. The macro ptrcast is just an abbreviation of a C++ cast and was introduced in HLIBpro for ease of use and for debugging purposes (ptrcast is different depending on compilation options).
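The idea of a number-based type test followed by a cast can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: type_id, Matrix, SparseMatrix, DenseMatrix, is_sparse and as_sparse are hypothetical names, the test matches the exact type only, and nothing here reflects how IS_TYPE or ptrcast are actually implemented.

```cpp
#include <string>

// Hypothetical type identifiers; a real library may generate these differently.
enum class type_id { sparse_matrix, dense_matrix };

struct Matrix
{
    virtual ~Matrix () = default;

    virtual type_id      type    () const = 0;   // numeric type tag
    virtual std::string  typestr () const = 0;   // human readable type name
};

struct SparseMatrix : Matrix
{
    type_id      type    () const override { return type_id::sparse_matrix; }
    std::string  typestr () const override { return "SparseMatrix"; }
};

struct DenseMatrix : Matrix
{
    type_id      type    () const override { return type_id::dense_matrix; }
    std::string  typestr () const override { return "DenseMatrix"; }
};

// Type test: a single integer comparison, no compiler RTTI involved.
bool is_sparse ( const Matrix * M )
{
    return M != nullptr && M->type() == type_id::sparse_matrix;
}

// Checked downcast built on the type test; returns nullptr on mismatch.
const SparseMatrix * as_sparse ( const Matrix * M )
{
    return is_sparse( M ) ? static_cast< const SparseMatrix * >( M ) : nullptr;
}
```

For these two classes, as_sparse gives the same result as dynamic_cast< const SparseMatrix * >, while typestr() additionally provides a printable name.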

We now make use of the TSparseMatrix pointer S and may output some information about it:

cout << " matrix has dimension " << S->rows() << " x " << S->cols() << endl;
cout << " no of non-zeroes = " << S->n_non_zero() << endl;
cout << " matrix is " << ( S->is_complex() ? "complex" : "real" )
<< " valued" << endl;
cout << " format = ";
if ( S->is_nonsym() ) cout << "non symmetric" << endl;
else if ( S->is_symmetric() ) cout << "symmetric" << endl;
else if ( S->is_hermitian() ) cout << "hermitian" << endl;
cout << " size of sparse matrix = " << Mem::to_string( S->byte_size() ) << endl;
cout << " |S|_F = " << norm_F( S ) << endl;

This includes the dimension of the matrix, i.e. the number of rows and columns, as well as the number of non-zero elements. Furthermore, the value type, real or complex, and the matrix properties are printed. Finally, the storage size and the Frobenius norm are printed.
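To make two of the printed quantities concrete, the following sketch computes the number of stored entries and the Frobenius norm \(\|S\|_F = \sqrt{\sum_{ij} s_{ij}^2}\) for a small matrix in compressed row storage. CrsMatrix, laplace_1d, n_non_zero and norm_frobenius are hypothetical names for this example only, and the sketch assumes that no explicit zeroes are stored.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal compressed row storage container; hypothetical, not TSparseMatrix.
struct CrsMatrix
{
    std::size_t                 nrows = 0, ncols = 0;
    std::vector< std::size_t >  row_ptr;  // nrows+1 offsets into col_idx/values
    std::vector< std::size_t >  col_idx;  // column index of each stored entry
    std::vector< double >       values;   // value of each stored entry
};

// Assemble the n x n matrix tridiag(-1, 2, -1) as test data.
CrsMatrix laplace_1d ( std::size_t n )
{
    CrsMatrix S;

    S.nrows = S.ncols = n;
    S.row_ptr.push_back( 0 );

    for ( std::size_t i = 0; i < n; ++i )
    {
        if ( i > 0 )     { S.col_idx.push_back( i-1 ); S.values.push_back( -1.0 ); }
        S.col_idx.push_back( i );
        S.values.push_back( 2.0 );
        if ( i + 1 < n ) { S.col_idx.push_back( i+1 ); S.values.push_back( -1.0 ); }

        S.row_ptr.push_back( S.col_idx.size() );  // end of row i
    }

    return S;
}

// Number of stored entries (assumes no explicitly stored zeroes).
std::size_t n_non_zero ( const CrsMatrix & S )
{
    return S.values.size();
}

// Frobenius norm: square root of the sum of squares of all stored entries.
double norm_frobenius ( const CrsMatrix & S )
{
    double sum = 0.0;

    for ( double v : S.values )
        sum += v * v;

    return std::sqrt( sum );
}
```

For laplace_1d( 3 ), seven entries are stored and the sum of squares is \(3 \cdot 4 + 4 \cdot 1 = 16\), so the Frobenius norm is 4.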

Conversion to H-Format

Having a right hand side, the equation system above could now be solved using an iterative scheme. Unfortunately, most sparse matrices need some form of preconditioning for an acceptable convergence behaviour. The preconditioner shall be in the form of an H-LU factorisation of \(A\) and hence, of S. For this, we need a cluster tree and a block cluster tree.

Since, by assumption, no geometrical data is available, the clustering is performed purely algebraically by using the connectivity relation of the indices stored in the sparse matrix itself. Furthermore, nested dissection is to be used, which needs some special clustering.

Algebraic clustering uses graph partitioning as the underlying technique. HLIBpro implements two graph partitioning algorithms (TBFSAlgPartStrat and TMLAlgPartStrat) and supports external algorithms, e.g. METIS and Scotch. Here we use the BFS-based partitioning strategy implemented by TBFSAlgPartStrat.
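As a rough illustration of BFS-based graph partitioning, the sketch below grows one part by breadth-first search from a start node until it holds half of the nodes. It is a deliberately simplified stand-in and not the algorithm implemented by TBFSAlgPartStrat; Graph and bfs_bisect are hypothetical names, the start node is assumed to be a valid index, and no refinement of the resulting cut is attempted.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Undirected graph as adjacency lists; node i is connected to all g[i][k].
using Graph = std::vector< std::vector< std::size_t > >;

// Bisect the graph: nodes reached first by a BFS from start form part 0,
// all remaining nodes (including unreachable ones) form part 1.
std::vector< int > bfs_bisect ( const Graph & g, std::size_t start )
{
    const std::size_t          n = g.size();
    std::vector< int >         part( n, 1 );
    std::vector< bool >        seen( n, false );
    std::queue< std::size_t >  queue;
    std::size_t                assigned = 0;

    if ( n == 0 )
        return part;

    seen[ start ] = true;
    queue.push( start );

    while ( ! queue.empty() && assigned < n / 2 )
    {
        const std::size_t v = queue.front();

        queue.pop();
        part[ v ] = 0;
        ++assigned;

        for ( std::size_t w : g[ v ] )
        {
            if ( ! seen[ w ] )
            {
                seen[ w ] = true;
                queue.push( w );
            }
        }
    }

    return part;
}
```

For a chain of six nodes and start node 0, the first three nodes end up in part 0 and the last three in part 1. In the sparse matrix setting, the graph would be given by the non-zero pattern of the matrix.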

Nested dissection clustering is based on a standard bisection clustering algorithm, which therefore has to be defined explicitly in the form of TAlgCTBuilder, which in turn uses the graph partitioning strategy. Finally, TAlgNDCTBuilder computes the cluster tree for the index set defined by S:

TBFSAlgPartStrat part_strat;
TAlgCTBuilder ct_builder( & part_strat );
TAlgNDCTBuilder nd_ct_builder( & ct_builder );
auto ct = nd_ct_builder.build( S );
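The principle behind nested dissection can be shown on the simplest possible graph, a chain of indices. In the sketch below, each index range is split by a single separator index, both halves are ordered recursively, and the separator is ordered last. Since the two halves are only coupled through the separator, the off-diagonal blocks between them are zero and stay zero during an LU factorisation. The function nd_order_chain is a hypothetical helper for this example; clustering a general sparse matrix, as TAlgNDCTBuilder does, is more involved.

```cpp
#include <cstddef>
#include <vector>

// Append the nested dissection ordering of the chain indices [lo, hi)
// to order: left half, right half, then the separator index.
void nd_order_chain ( std::size_t lo, std::size_t hi,
                      std::vector< std::size_t > & order )
{
    // small ranges are leaves and keep their natural ordering
    if ( hi - lo <= 2 )
    {
        for ( std::size_t i = lo; i < hi; ++i )
            order.push_back( i );
        return;
    }

    const std::size_t sep = lo + ( hi - lo ) / 2;  // single separator index

    nd_order_chain( lo,      sep, order );         // left sub domain
    nd_order_chain( sep + 1, hi,  order );         // right sub domain
    order.push_back( sep );                        // separator comes last
}
```

For the range [0, 7) this yields the ordering 0, 2, 1, 4, 6, 5, 3, which can be read as a mapping from the internal position to the external index.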

Having the cluster tree, the block cluster tree can be computed:

TWeakAlgAdmCond adm_cond( S, ct->perm_i2e() );
TBCBuilder bct_builder;
auto bct = bct_builder.build( ct.get(), ct.get(), & adm_cond );
cout << " sparsity constant = " << bct->compute_c_sp() << endl;

Here, the weak admissibility condition for graphs, implemented by TWeakAlgAdmCond, is used. Since the block clusters to test are based on the internal ordering of the H-matrix, whereas the sparse matrix is defined in the external ordering, the admissibility condition needs the corresponding mapping between the two.
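The relation between the two orderings can be sketched with plain index vectors. The example assumes that an internal-to-external mapping stores, for each internal position, the external index found there; invert and gather are hypothetical helpers and do not correspond to the TPermutation interface.

```cpp
#include <cstddef>
#include <vector>

// Compute the inverse mapping: if i2e[i] == e then e2i[e] == i.
std::vector< std::size_t > invert ( const std::vector< std::size_t > & i2e )
{
    std::vector< std::size_t > e2i( i2e.size() );

    for ( std::size_t i = 0; i < i2e.size(); ++i )
        e2i[ i2e[ i ] ] = i;

    return e2i;
}

// Reorder a vector: y[k] = x[ perm[k] ].
std::vector< double > gather ( const std::vector< std::size_t > & perm,
                               const std::vector< double > &      x )
{
    std::vector< double > y( x.size() );

    for ( std::size_t k = 0; k < perm.size(); ++k )
        y[ k ] = x[ perm[ k ] ];

    return y;
}
```

With these conventions, gather( i2e, x ) brings an externally ordered vector into the internal ordering, and gather( e2i, ... ) maps it back, so applying both in sequence reproduces x.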

Remarks
See Parameters for nmin, cluster_level_mode and sync_interface_depth, which may be changed to optimise the partitioning.

The printed sparsity constant gives some hint about the complexity of the H-matrix constructed over the block cluster tree: the smaller, the better.

The final conversion of the sparse matrix into H-format is performed by TSparseMBuilder. Here, again the mappings between internal and external ordering are needed. The accuracy can usually be neglected, since low rank blocks are usually empty (clusters of positive distance usually have no overlapping basis functions).

TSparseMBuilder h_builder( S, ct->perm_i2e(), ct->perm_e2i() );
TTruncAcc acc( real(0.0) );
auto A = h_builder.build( bct.get(), acc );
cout << " size of H-matrix = " << Mem::to_string( A->byte_size() ) << endl;
cout << " |A|_F = " << norm_F( A.get() ) << endl;
{
auto PA = make_unique< TPermMatrix >( ct->perm_i2e(), A.get(), ct->perm_e2i() );
cout << " |S-A|_2 = " << diff_norm_2( S, PA.get() ) << endl;
}

After matrix construction, the size of the H-matrix \(A\) and its norm are printed together with the (relative) norm of the difference \(\|S-A\|_2/\|S\|_2\), indicating any approximation error. For this, the matrix \(A\) should operate with the same ordering of the unknowns as \(S\) and is therefore wrapped in a TPermMatrix object.

H-LU factorisation

The factorisation of A may be computed differently, depending on the format of A, e.g. whether it is symmetric or not. In the nonsymmetric case the standard \(LU\) factorisation is used, implemented in TLU, while in the symmetric case the \(LDL^T\) (or \(LDL^H\) for hermitian matrices) factorisation implemented in TLDL is computed.

The factorisation is performed in place, i.e. the data of A is overwritten by the corresponding factors. This implies that A is actually no longer a matrix object after the factorisation, since the stored data needs special handling. For this, linear operators are used with special versions for each factorisation algorithm to permit evaluation of the factorised matrix or its inverse.

const TTruncAcc fac_acc( 1e-4 );
auto A_inv = factorise_inv( A.get(), fac_acc );

The factorisation is performed up to a block-wise accuracy of \(10^{-4}\). Depending on the given matrix, this has to be changed to obtain a direct solver (a single matrix-vector multiplication) or a good preconditioner (a small number of iterations).

Some properties of the factorisation are printed next, most notably the inversion error \(\|I-(LU)^{-1}S\|_2\) and the condition estimate \(\|LU\mathbf{1}\|_2\). However, to compare \((LU)^{-1}\) with \(S\), both need to have the same ordering of the indices. Instead of reordering \(S\), the factorised matrix may be wrapped in an object representing a permuted matrix:

auto PA_inv = make_unique< TPermMatrix >( ct->perm_i2e(), A_inv.get(), ct->perm_e2i() );

Here, an operator was built, which first reorders w.r.t. the H-matrix, then applies the factorised matrix and finally reorders back.
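The three steps of such a permuted operator can be written down generically. In the following sketch, Vec, Op and permuted are hypothetical names, operators are modelled as std::function objects, and the permutations are plain index vectors with the convention that i2e[i] is the external index at internal position i. It only mirrors the description above and is not the TPermMatrix implementation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector< double >;
using Op  = std::function< Vec ( const Vec & ) >;

// Wrap an operator acting in the internal ordering such that it can be
// applied to vectors in the external ordering.
Op permuted ( Op                          A_int,
              std::vector< std::size_t >  i2e,
              std::vector< std::size_t >  e2i )
{
    return [A_int, i2e, e2i] ( const Vec & x_ext )
    {
        const std::size_t n = x_ext.size();
        Vec               x_int( n );

        // 1. reorder the input into the internal ordering
        for ( std::size_t i = 0; i < n; ++i )
            x_int[ i ] = x_ext[ i2e[ i ] ];

        // 2. apply the wrapped operator
        const Vec y_int = A_int( x_int );

        // 3. reorder the result back into the external ordering
        Vec y_ext( n );

        for ( std::size_t e = 0; e < n; ++e )
            y_ext[ e ] = y_int[ e2i[ e ] ];

        return y_ext;
    };
}
```

Wrapping a diagonal operator with internal diagonal (1, 2, 3) and i2e = (2, 0, 1), e2i = (1, 2, 0) gives an operator with external diagonal (2, 3, 1).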

Now, the inversion error may be computed:

cout << " size of LU factor = " << Mem::to_string( A->byte_size() ) << endl
<< " inv. error wrt S = " << inv_approx_2( S, PA_inv.get() )
<< endl
<< " condest(LU) = " << condest( PA_inv.get() ) << endl;

Solving the Equation System

Finally, the equation system can be solved using the inverse of the matrix factorisation as a preconditioner. The right hand side is assumed to be available in Matlab format and is read in using read_vector, which again performs file format auto detection. The solution vector is created using the col_vector function of S, which returns a vector defined over the column index set of S.

TAutoSolver  solver;
TSolverInfo  sinfo;
auto         b = read_vector( "b.mat" );  // right hand side, read from file
auto         x = S->col_vector();         // solution vector over the column index set of S

solver.solve( S, x.get(), b.get(), PA_inv.get(), & sinfo );

if ( sinfo.converged() )
    cout << " converged in " << sinfo.n_iter() << " steps"
         << " with rate " << sinfo.conv_rate()
         << ", |r| = " << sinfo.res_norm() << endl;
else
    cout << " not converged in " << sinfo.n_iter() << " steps " << endl;

write_vector( x.get(), "x.mat" );

With write_vector, the solution vector is written to a file, again in Matlab format.
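How an approximate inverse acts as a preconditioner can be demonstrated with the simplest preconditioned scheme, the Richardson iteration \(x \leftarrow x + W(b - Ax)\). This scheme is chosen here only for brevity; which method TAutoSolver selects is not stated above. Vec, Op, SolveInfo, norm2 and richardson are hypothetical names for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector< double >;
using Op  = std::function< Vec ( const Vec & ) >;

// Minimal record of the iteration outcome (hypothetical, not TSolverInfo).
struct SolveInfo
{
    bool         converged;
    std::size_t  n_iter;
    double       res_norm;
};

// Euclidean norm of a vector.
double norm2 ( const Vec & v )
{
    double sum = 0.0;

    for ( double e : v )
        sum += e * e;

    return std::sqrt( sum );
}

// Preconditioned Richardson iteration: x <- x + W( b - A x ),
// stopped when |b - A x| <= tol * |b| or after max_iter updates.
SolveInfo richardson ( const Op & A, const Op & W, const Vec & b, Vec & x,
                       double tol, std::size_t max_iter )
{
    SolveInfo     info{ false, 0, 0.0 };
    const double  b_norm = norm2( b );

    for ( std::size_t it = 0; it <= max_iter; ++it )
    {
        const Vec Ax = A( x );
        Vec       r( b.size() );

        for ( std::size_t i = 0; i < b.size(); ++i )
            r[ i ] = b[ i ] - Ax[ i ];

        info.n_iter   = it;
        info.res_norm = norm2( r );

        if ( info.res_norm <= tol * b_norm )
        {
            info.converged = true;
            return info;
        }

        if ( it == max_iter )
            break;

        const Vec c = W( r );   // preconditioned correction

        for ( std::size_t i = 0; i < x.size(); ++i )
            x[ i ] += c[ i ];
    }

    return info;
}
```

If W is the exact inverse of A, a single update suffices, which corresponds to the direct solver case. A cruder W, e.g. the inverse of the diagonal of A, still converges for suitable matrices but needs more steps.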

Remarks
File format autodetection when reading a file is performed by actually looking at the content of the file, yielding a robust autodetection algorithm. When writing a file, only the filename, especially the filename extension is used, which may not be unique between different formats (but works in most cases).
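The difference between the two detection modes can be sketched as follows. The example only checks two well-known signatures, the "%%MatrixMarket" banner of MatrixMarket files and the "MATLAB 5.0 MAT-file" text at the start of Matlab level 5 files, and two common filename extensions. Format, starts_with, detect_from_content and guess_from_name are hypothetical names, and the checks performed by the library itself are not documented here.

```cpp
#include <string>

enum class Format { matlab, matrix_market, unknown };

// True if s begins with prefix.
bool starts_with ( const std::string & s, const std::string & prefix )
{
    return s.compare( 0, prefix.size(), prefix ) == 0;
}

// Reading: decide by looking at the first bytes of the file content.
Format detect_from_content ( const std::string & head )
{
    if ( starts_with( head, "%%MatrixMarket" ) )      return Format::matrix_market;
    if ( starts_with( head, "MATLAB 5.0 MAT-file" ) ) return Format::matlab;

    return Format::unknown;
}

// Writing: only the filename is available, so decide by its extension.
Format guess_from_name ( const std::string & filename )
{
    const std::string::size_type dot = filename.rfind( '.' );

    if ( dot == std::string::npos )
        return Format::unknown;

    const std::string ext = filename.substr( dot + 1 );

    if ( ext == "mat" ) return Format::matlab;
    if ( ext == "mtx" ) return Format::matrix_market;

    return Format::unknown;
}
```

The content check does not depend on how the file is named, whereas the extension check fails as soon as a file has no extension or one shared by several formats.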

The program is finished with the finalisation of HLIBpro and the closing of the try block:

        DONE();
    }
    catch ( Error & e )
    {
        cout << e.to_string() << endl;
    }

    return 0;
}

The Plain Program

#include <cstdlib>
#include <iostream>
#include <memory>   // std::make_unique

#include "hlib.hh"

using namespace std;
using namespace HLIB;

int
main ( int argc, char ** argv )
{
    try
    {
        INIT();

        // load the matrix and ensure that it is sparse
        auto M = read_matrix( "M.mat" );

        if ( ! IS_TYPE( M, TSparseMatrix ) )
        {
            cout << "given matrix is not sparse (" << M->typestr() << ")" << endl;
            exit( 1 );
        }

        auto S = ptrcast( M.get(), TSparseMatrix );

        cout << " matrix has dimension " << S->rows() << " x " << S->cols() << endl
             << " no of non-zeroes = " << S->n_non_zero() << endl
             << " matrix is " << ( S->is_complex() ? "complex" : "real" )
             << " valued" << endl
             << " format = ";

        if ( S->is_nonsym() ) cout << "non symmetric" << endl;
        else if ( S->is_symmetric() ) cout << "symmetric" << endl;
        else if ( S->is_hermitian() ) cout << "hermitian" << endl;

        cout << " size of sparse matrix = " << Mem::to_string( S->byte_size() ) << endl;
        cout << " |S|_F = " << norm_F( S ) << endl;

        // algebraic clustering with nested dissection
        TBFSAlgPartStrat part_strat;
        TAlgCTBuilder ct_builder( & part_strat );
        TAlgNDCTBuilder nd_ct_builder( & ct_builder );
        auto ct = nd_ct_builder.build( S );

        // block cluster tree
        TWeakAlgAdmCond adm_cond( S, ct->perm_i2e() );
        TBCBuilder bct_builder;
        auto bct = bct_builder.build( ct.get(), ct.get(), & adm_cond );

        cout << " sparsity constant = " << bct->compute_c_sp() << endl;

        // conversion to H-format
        TSparseMBuilder h_builder( S, ct->perm_i2e(), ct->perm_e2i() );
        TTruncAcc acc( real(0.0) );
        auto A = h_builder.build( bct.get(), acc );

        cout << " size of H-matrix = " << Mem::to_string( A->byte_size() ) << endl;
        cout << " |A|_F = " << norm_F( A.get() ) << endl;

        {
            auto PA = make_unique< TPermMatrix >( ct->perm_i2e(), A.get(), ct->perm_e2i() );

            cout << " |S-A|_2 = " << diff_norm_2( S, PA.get() ) << endl;
        }

        // H-LU factorisation (A is overwritten by its factors)
        const TTruncAcc fac_acc( 1e-4 );
        auto A_inv = factorise_inv( A.get(), fac_acc );
        auto PA_inv = make_unique< TPermMatrix >( ct->perm_i2e(), A_inv.get(), ct->perm_e2i() );

        cout << " size of LU factor = " << Mem::to_string( A->byte_size() ) << endl
             << " inv. error wrt S = " << inv_approx_2( S, PA_inv.get() )
             << endl
             << " condest(LU) = " << condest( PA_inv.get() ) << endl;

        // solve S x = b with the factorisation as preconditioner
        TAutoSolver solver;
        TSolverInfo sinfo;
        auto b = read_vector( "b.mat" );  // right hand side
        auto x = S->col_vector();         // solution vector

        solver.solve( S, x.get(), b.get(), PA_inv.get(), & sinfo );

        if ( sinfo.converged() )
            cout << " converged in " << sinfo.n_iter() << " steps"
                 << " with rate " << sinfo.conv_rate()
                 << ", |r| = " << sinfo.res_norm() << endl;
        else
            cout << " not converged in " << sinfo.n_iter() << " steps " << endl;

        write_vector( x.get(), "x.mat" );

        DONE();
    }
    catch ( Error & e )
    {
        cout << e.to_string() << endl;
    }

    return 0;
}