# Features

## Clustering

Clustering is the first step towards an ℋ-matrix and defines a hierarchical decomposition of the given set of unknowns.

### Geometrical Clustering

Geometrical clustering uses available spatial information, e.g. coordinates, for
each index to decompose the index set. A standard algorithm is *binary space
partitioning*, where the coordinates are geometrically decomposed by a hyperplane
in each step of the algorithm. Several variants exists, which choose the dividing
hyperplane differently:

- geometrically balanced, e.g. maintain equal volume per sub cluster
- cardinality balanced, e.g. maintain equal number of indices per sub cluster
- principal component analysis

### Algebraic Clustering

In case of a sparse matrix without geometric information for each index, a clustering method based on the connectivity information in the matrix is available. HLIBpro implements two algorithms based on breath-first-search and multi-level graph partitioning, combined with partitioning optimisation. Furthermore, HLIBpro supports the external graph partitioning libraries (METIS, SCOTCH).

### Add-On Techniques

In the case of sparse matrices, geometrically neighboured domains can be seperated via *nested
dissection*. Applied to clustering, the generated ℋ-matrix has many large, empty off-diagonal
matrix blocks which remain zero during matrix factorisation. This significantly improves memory
and computational complexity and can be used with geometrical as well as algebraic clustering.

Many applications have an internal block-structure, e.g. where the x-, y- and z-components are treated
individually, that is unknown or hard to detect for HLIBpro. For such cases, a user defined partitioning
may be provided, seperating the indices in a first step and then appling the standard clustering techniques,
e.g. geomtrical or cardinality based clustering. In the same way, it is also possible to enforce individual
indices to *stick together*, e.g. they will never be split during clustering.

## ℋ-Matrix Algebra

### Matrix Construction

Matrix construction is implemented for dense and sparse matrices. While in the latter case the matrix coefficients only need to be copied into the new ℋ-matrix, dense matrices require approximation techniques to maintain linear complexity. Two major methods are available in HLIBpro: adaptive and hybrid cross approximation. Mainly for comparative purposes, also singular value decomposition is implemented, which gives the best approximation but comes with O(n^{3}complexity. During matrix construction, neighboured matrix blocks may be joined together provided that the joined block uses less memory while still maintaining the same approximation. This is called

*coarsening*and may reduce the memory requirements drastically.

#### Adaptive Cross Approximation (ACA)

ACA is a heuristical approach, which successively constructs rank-1 updates based on the magnitude of the matrix coefficients. HLIBpro implements the original ACA algorithm as well as an improved version (ACA+), known to work on a broader range of matrices.

For ACA and ACA+ only the matrix coefficients of the dense matrix are neccessary. These are supplied in the form of a coefficient function by the user. That way, only O(n log n) many coefficients are needed, which greatly reduces the computational complexity.

#### Hybrid Cross Approximation (HCA)

HCA avoids the problems of ACA and is proven to yield an approximatation up to a given accuracy. For this, the matrix coefficients as well as simple integrals (collocation integrals of the generating function of the kernel) have to be supplied by the user. The complexity is again in O(n log n) and in many cases, especially for larger problems better than ACA/ACA+.

#### Dense Approximation

For direct approximation of dense matrices singular value decomposition (SVD), randomized SVD and rank revealing QR is implemented. These methods are also available for low-level algebra, e.g., low-rank truncation.

### ℋ-Matrix Algebra

The standard algebra includes matrix-vector multiplication, matrix addition and matrix multiplication. These algorithms are available for shared memory and on distributed memory systems (except matrix multiplication).

HLIBpro implements matrix inversion and various matrix factorisation techniques, e.g., LU, LDL, Cholesky and WAZ factorisation. The corresponding matrix solve methods for triangular (ℋ-) matrices are available.

#### Accumulator based Arithmetic

In the standard ℋ-arithmetic, updates are applied directly to the destination blocks and as soon as the updates are available.

With accumulator based arithmetic, all updates to a specific matrix block is first
*accumulated* and then applied in a single step. Furthermore, updates follow
the hierarchy of the ℋ-matrix. This significantly reduces the number of low-rank truncations
and hence, the runtime of ℋ-arithmetic.

### Misc.

HLIBpro furthermore, contains functions for computing the Frobenius, the spectral norm and infinity norm of matrices (and their difference, e.g. A-B). In the case of the spectral norm, one may also compute the approximation with respect to the inverse of a matrix, e.g. ∥I-B·A∥₂ where B is the approximate inverse of A. This allows to check the quality of ℋ-factorisation or inversion for solving or preconditioning.

## Parallelisation

The main focus of the parallelisation of the ℋ-matrix arithmetic is on shared memory systems with a task based runtime systems (using Intel TBB). Almost all functions make use of the various CPU cores available in todays processors and many function will achieve an optimal speedup for up to 100 cores (probably more, but no hardware available for tests).

A limited set of routines is also available for distributed memory machines, e.g., matrix construction, matrix-vector multiplication and ℋ-LU for dense matrices, e.g., no support for sparse matrices with nested dissection.

## ℋ²-Matrices

ℋ²-matrices are a special version of ℋ-matrices with an even more compact representation. HLIBpro implements the construction of cluster bases and the conversion of ℋ-matrices to ℋ²-matrices.

## BEM

HLIBpro contains a complete application layer for boundary element methods. It is based on triangular surface grids and provides various function spaces, e.g. piecewise constant and linear as well as edge spaces. Furthermore, bilinear forms for Laplace SLP/DLP, Helmholtz SLP/DLP, Maxwell EFIE/MFIE and acoustic scattering are provided. Optimised implementations of these kernels using SSE3, AVX, AVX2, AVX512 and MIC vector operations are available.

## Iterative Solvers

Iterative algorithms for solving linear equation systems with or without preconditioning are part of HLIBpro:

- Richardson iteration
- CG, CGS and BiCG-Stab iteration
- MINRES and GMRES iteration
- TFQMR iteration

## Input/Output

HLIBpro supports several standard file-formats for matrices, vectors and triangular surface grids.

- HLIBpro format
- support for ℋ-, sparse and dense matrices
- support for dense vectors
- support for cluster bases
- support for triangle grids

- SAMG
- support for sparse matrices and dense vectors
- Matlab
- support for sparse and dense matrices and vectors (possibly in
*structs*) - compression supported with zlib

- support for sparse and dense matrices and vectors (possibly in
- Harwell-Boeing/Rutherford and MatrixMarket
- HDF5 format for dense and low-rank matrices
- Ply, SurfaceMesh and GMSH v2 file format for grids

Cluster trees, block cluster trees, matrices, coordinates, grids and grid functions (BEM) may also be visualised in PostScript, PDF, VTK and VRML (for 3D data) file format, providing various options to modify the output and to display only the interesting information.