News
v2.4 (released 2015-10-28)
 Added factorization of the inverse matrix, WAZ = I, enabling vector solves via matrix-vector multiplication instead of forward/backward solves, with much better parallel speedup.
 Significant improvements in parallel performance of matrix inversion.
 Improved performance of LU, matrix-vector multiplication, forward/backward solves.
 Added function nearfield_sparse to extract ℋ-matrix nearfield as sparse matrix.
 Switched to adaptive_split_axis as default for clustering.

Minor Changes:
 Additional options for matrix visualization (colormap, etc.)
 Basic VTK output of block clusters.
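
The factorised inverse introduced in v2.4 can be illustrated with a small dense sketch. If WAZ = I, then the inverse of A equals ZW, so a solve reduces to two matrix-vector products, x = Z(Wb). The code below is not HLIBpro code: the aliases Vector and Matrix and the helpers matvec and solve_with_inverse_factors are hypothetical names chosen for illustration, and dense storage is assumed only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;   // dense, row-major (assumption for this sketch)

// y = M * x
Vector matvec(const Matrix& M, const Vector& x)
{
    Vector y(M.size(), 0.0);
    for (std::size_t i = 0; i < M.size(); ++i) {
        assert(M[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += M[i][j] * x[j];
    }
    return y;
}

// Solve A x = b given factors W and Z with W A Z = I, i.e. A^{-1} = Z W.
// Only matrix-vector products are needed, no forward/backward substitution.
Vector solve_with_inverse_factors(const Matrix& W, const Matrix& Z, const Vector& b)
{
    return matvec(Z, matvec(W, b));
}
```

For A = [[2, 1], [4, 3]] with LU factors L = [[1, 0], [2, 1]] and U = [[2, 1], [0, 1]], one valid choice is W = L^{-1} = [[1, 0], [-2, 1]] and Z = U^{-1} = [[0.5, -0.5], [0, 1]]. With b = (3, 7) the function returns x = (1, 1).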
v2.3
v2.3.2 (released 2015-06-23)
 fixed various bugs and race conditions
 extended constructors of various matrix classes to accept optional value type field
 added example on how to assemble block matrices
v2.3.1 (released 2014-11-12)
 Fixed two bugs in pointwise LU.

Solver changes:
 Refactored solver classes (no interface changes); added TRichardson to replace TSolver in the future.
 Fixed inconsistent computation of residual norm in solvers. Now Richardson, CG and BiCG will compute standard residual norm, while MINRES and GMRES compute preconditioned residual norm.
 Made initialisation of start vector in solver classes optional (function initialise_start_value).

 Added function diagonal to extract diagonal of a matrix.
 Added example spectrum to compute spectrum of graph Laplacian (see also documentation).
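
The two residual norms named in the v2.3.1 solver changes differ as follows: for an approximate solution x, the standard residual is r = b - Ax, whereas the preconditioned residual applies the preconditioner to r before the norm is taken. The sketch below is not the HLIBpro solver interface. All function names are hypothetical, and a diagonal (Jacobi) preconditioner is assumed purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;   // dense, row-major (assumption for this sketch)

// r = b - A x
Vector residual(const Matrix& A, const Vector& x, const Vector& b)
{
    assert(A.size() == b.size());
    Vector r(b);
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j)
            r[i] -= A[i][j] * x[j];
    }
    return r;
}

// Euclidean norm
double norm2(const Vector& v)
{
    double sum = 0.0;
    for (double vi : v)
        sum += vi * vi;
    return std::sqrt(sum);
}

// || b - A x ||_2
double standard_residual_norm(const Matrix& A, const Vector& x, const Vector& b)
{
    return norm2(residual(A, x, b));
}

// || D^{-1} (b - A x) ||_2 with D = diag(A), an assumed Jacobi preconditioner
double preconditioned_residual_norm(const Matrix& A, const Vector& x, const Vector& b)
{
    Vector r = residual(A, x, b);
    for (std::size_t i = 0; i < r.size(); ++i) {
        assert(A[i][i] != 0.0);
        r[i] /= A[i][i];
    }
    return norm2(r);
}
```

The two values can differ considerably, so a stopping tolerance only has a comparable meaning across solvers when it is known which norm is monitored.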
v2.3 (released 2014-10-27)
 Fixed issues when solving dense matrices (used in new example for many RHSs).
 Modified THiLoFreqGeomAdmCond: now maximal number of wavelengths per cluster is tested.
 Refactored geometrical clustering classes and partitioning strategies, thereby fixing several issues.

C++11 changes:
 most object creating functions now return std::unique_ptr,
 replaced typedef by using,
 added iterators for TIndexSet, TNodeSet, TGraph, TProcSet (for range based for).

 Added parameter to algebraic clustering in C bindings to define partitioning algorithm (BFS, multi-level, METIS or Scotch).
 Fixed issues with progress bar during factorisation (wrong block count).
 Removed BSP style communication functions (MPI only now).
 Finished conversion to new packed_t SIMD type. Using SSE3 instead of SSE2.
 Added lock to TScotchAlgPartStrat because Scotch is not multi-thread safe.
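
The three C++11 changes listed for v2.3 can be shown in a generic form. The sketch below does not use any HLIBpro type: IndexRange, make_index_range, index_list_t and to_list are hypothetical stand-ins that demonstrate a factory returning std::unique_ptr, a using alias in place of a typedef, and begin()/end() iterators that enable range-based for.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// alias declaration, replacing: typedef std::vector<int> index_list_t;
using index_list_t = std::vector<int>;

// hypothetical index set covering the indices first, ..., last
class IndexRange
{
public:
    // minimal iterator, sufficient for range-based for
    class iterator
    {
    public:
        explicit iterator(int pos) : _pos(pos) {}
        int operator*() const { return _pos; }
        iterator& operator++() { ++_pos; return *this; }
        bool operator!=(const iterator& other) const { return _pos != other._pos; }
    private:
        int _pos;
    };

    IndexRange(int first, int last) : _first(first), _last(last)
    {
        assert(first <= last + 1);   // an empty range is allowed
    }

    iterator begin() const { return iterator(_first); }
    iterator end() const { return iterator(_last + 1); }

private:
    int _first, _last;
};

// factory handing over ownership via std::unique_ptr instead of a raw pointer
std::unique_ptr<IndexRange> make_index_range(int first, int last)
{
    // plain C++11: std::make_unique only arrived with C++14
    return std::unique_ptr<IndexRange>(new IndexRange(first, last));
}

// collect all indices using a range-based for loop
index_list_t to_list(const IndexRange& range)
{
    index_list_t result;
    for (int idx : range)
        result.push_back(idx);
    return result;
}
```

Calling to_list(*make_index_range(2, 5)) yields the indices 2, 3, 4, 5, and the range object is released automatically when the unique_ptr goes out of scope.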
v2.2 (released 2014-07-15)
 Removed implicit reordering of unknowns during matrix-vector multiplication to fix inconsistent behaviour. Please use permutations from cluster trees or ℋ-matrices to reorder vectors, or TPermMatrix to represent permuted matrices, instead.
 Speedup improvements for matrix inversion. Triangular inversion and matrix multiplication available in standard user interface.
 Import/export from/to CCS/CRS matrices simplified.
 Simplified (and faster) mutex wrapper.
 Several C++11 changes.
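
Since v2.2 no longer reorders unknowns implicitly, vectors have to be moved between orderings explicitly. The sketch below shows the general idea with a permutation stored as an index array. It is not the HLIBpro permutation interface: the names Permutation, permute and invert are hypothetical, and the convention that perm[i] is the target position of entry i is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector      = std::vector<double>;
using Permutation = std::vector<std::size_t>;

// reorder x such that y[perm[i]] = x[i]
Vector permute(const Permutation& perm, const Vector& x)
{
    assert(perm.size() == x.size());
    Vector y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(perm[i] < x.size());
        y[perm[i]] = x[i];
    }
    return y;
}

// inverse permutation: inv[perm[i]] = i
Permutation invert(const Permutation& perm)
{
    Permutation inv(perm.size());
    for (std::size_t i = 0; i < perm.size(); ++i) {
        assert(perm[i] < perm.size());
        inv[perm[i]] = i;
    }
    return inv;
}
```

A typical pattern under these assumptions is to permute the right-hand side into the ordering used by the matrix, apply the operator, and map the result back with the inverse permutation.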
v2.1 (released 2014-05-08)
 Removing reference counters in BLAS interface due to major performance issue on multicore (socket) systems. See documentation on how to use the modified interface (and avoid errors).
 New, scalable matrix-vector multiplication implemented.
 Using generic datatype for SIMD instructions, thereby enabling generic SIMD algorithms, e.g. for BEM kernels, and fast adoption of new SIMD instructions, e.g. AVX2.
 Removed TVirtualVector (replaced by TScalarVector).
 and, as usual: several bugs fixed
v2.0
v2.0.2 (released 2014-01-24)
 fixed race condition in C bindings
 fixed issue with initialisation of static variables
v2.0.1 (released 2013-12-05)
 fixed some bugs
v2.0 (released 2013-09-18)

Major Changes
 Switched from OpenMP to Threading Building Blocks as interface to shared memory parallelism, thereby also changing most algorithms to task-based parallelism.
 Reducing dependency on external libraries by using C++11 features. Also replacing some classes by default C++ versions (finally removing old code).
 Alternative, non-recursive, level-wise ℋ-LU factorisation based on explicit block dependencies, which provides far better speedup on many-core systems, e.g. Intel MIC architecture.
 New ℋ-LU factorisation algorithm also applicable in distributed environments, yielding better load balancing (albeit with limited speedup).
 Added support for multiple CPUs to many algorithms, e.g. in clustering, norm computations, matrix-vector multiplication and solves, ℋ²-conversion.

Minor Changes
 Optimised BEM kernels for Intel MIC architecture.
 Introduced TLinearOperator for operators not supporting TMatrix functionality, e.g. factorised matrices.
 HLIBpro file format changed due to internal changes and due to some bugs in the format. However, backward read compatibility for most files written with earlier versions is kept.
 Added support for Cairo library, thereby providing PDF output.
 And of course: many smaller feature upgrades and bug fixes.
v1.2 (released 2012-02-23)

Matrix Construction:
 Switched to template based coefficient functions (TCoeffFn and derived) and all dependent classes, e.g. TDenseMBuilder, SVD and ACA low rank approximation.
 Rewrote HCA:
 Simpler interface containing all necessary functionality in single class.
 Using template for value type.
 Added base classes for permuted indices and for BEM applications using quadrature.
 Added implementation for Laplace and Helmholtz also for linear ansatz spaces and with support for SSE2 and AVX.
 Cleaned up ACA implementation.
 Changed handling of recompression: should now be handled by default in low rank approximation algorithm and not by matrix construction class (to avoid recompression of optimal results).

Clustering Changes:
 Added TNDBSPPartStrat to be used in connection with nested dissection (trying various clusterings and choosing best for ND).
 Modified TNDBSPCTBuilder to more resemble algebraic version, e.g. average depth for interface clusters instead of maximal.
 Fixed bug in PCA based clustering and added version for cardinality based clustering.
 Added various flags to modify clustering, e.g. synchronisation of interface depth, enforcing block clusters with same depth of corresponding clusters, using symmetrised weights in algebraic clustering.

Input/Output and visualisation:
 Fixed bug in reading dense matrices.
 Changed order of dimension for coordinate IO using Matlab format: now ncoord × dimension (e.g. as also used by Sparse Matrix Collection).
 Added VTK visualisation for coordinates (with various options, e.g. marking clusters or index connectivity) and BEM grids.
 Added output of grids in HLIB format.
 Added coordinate IO in MatrixMarket format.

Changes in LAPACK wrapper:
 added LAPACK workspace queries for optimal workspace size instead of using predefined block size
 using xGESDD for large matrices
 various bug fixes.
v1.1
v1.1.1 (released 2011-11-29)
 Deactivated default coarsening during matrix construction.
 Added special ℋ²-matrix builder with predefined cluster bases.
v1.1 (released 2011-11-28)
 changes in BEM code:
 Added support for AVX.
 Performance speedups in SSE2 implementation of Helmholtz and Maxwell kernels.
 Runtime detection of SSE2/AVX availability and automatic choice of optimal kernel.
 Added matrix_format function to matrix coefficient functions to define whether unsymmetric, symmetric or hermitian (default: unsymmetric). Default build function in matrix builders now without format argument.
 Added support for ILP64 BLAS/LAPACK implementations (64-bit integers).
 Added support for AMDLibM (integrated in binary Linux distributions).
 Added vector IO in MatrixMarket format.
 Cleaned up C++ examples (thereby also removing Boost link dependency).
 several bug fixes
v1.0
v1.0.1 (released 2011-09-28)
 OpenMP exception handling changed: now all threads will stop as soon as possible in case of an error
 fixed several, previously undetected, non-critical compiler warnings (MS Visual C++)
 bug fixes
v1.0 (released 2011-07-01)
HLIBpro v1.0 is a major rewrite/reorganisation of many of the ℋ-matrix algorithms.
The following list of changes only covers the main topics and is far from complete.
 added distributed computing via MPI for matrix construction and factorisation
 added ℋ²-matrices
 added internal multi-level graph partitioning for black-box clustering
 added support for piecewise linear basis functions and Maxwell EFIE/MFIE
 rewrote interface to BLAS/LAPACK
 rewrote C interface with better mapping of internal C++ and C types
 increased robustness of matrix factorisation in case of badly conditioned matrices
 increased speedup of matrix factorisation in multi-threaded computations
 many performance improvements and bug fixes
v0.13
v0.13.6 (released 2009-01-30)
 added optional diagonal scaling of ℋ-matrices during LU factorisation
 added blockwise accuracy, e.g. accuracy depending on current matrix block
 rewrote accuracy handling in C bindings
 simplified BSP partitioning methods and added regular cardinality based and principal component based clustering
 added optional balancing of tree depth in cluster tree construction with predefined partitioning
 implemented optional double precision computation of matrix inversion and low-rank truncation in single precision mode
 fixed bug in calling single precision norm functions of LAPACK
 fixed bug in PostScript output and modified ℋ-matrix output in PostScript format
 added support for Jacobi based SVD (sgejsv and dgejsv) in LAPACK v3.2
v0.13.5 (released 2008-09-23)
 removed ID based cluster tree computations in matrices
 always computing SCC in algebraic clustering, also in nested dissection clustering
 reordering clusters depending on size ratio (large first)
 fixed bug with filenames without directories
 fixed non-exception-safe OpenMP usage
 added matrix reduction to nearfield part
 added dense low-rank multiplication if result is large dense matrix
v0.13.4 (released 2008-04-24)
 fixed solve functions in TLU, TLDL (checking for NULL blocks)
 fixed OpenMP call with zero threads in TLU, TLDL and TMatrixInv
 fixed operator = in autoptr (wrong const)
 removed unnecessary checks in TArray::copy
 fixed recursive call in restrict_blockdiag
 replaced fixed constants by type dependent constants in lapack.cc
 fixed TMatrixInv::multiply_diag when only D is dense
v0.13.3 (released 2008-03-27)
 fixed several warnings from Visual C++ and Intel C++ compilers
 moved all global variables and functions into HLIB namespace (except xerbla override)
 enabled user defined prefix for functions and types in C interface and added override for namespace name
 reactivated cardinality check when using HLIB_BSP_AUTO
v0.13.2 (released 2008-02-29)
 replaced threads and mutexes by OpenMP (thread start only, no scheduling)
 included log file support in addition to stdout
 added parallel LDL^{T} factorisation (DD and blockdiag only)
 added parallel blockdiag LU factorisation
 added zero approximation during matrix construction (for nearfield only)
 fixed bug in algebraic nested dissection clustering (wrong path length in interface)
v0.13.1 (released 2008-02-04)
 reduced memory consumption/fragmentation in ACA generated matrices with large rank
 added Fiduccia/Mattheyses bisection optimisation for BFS clustering
 added FFT for vectors by implementing support for FFTW3 (optional)
 fixed bug in TBSPPartCTBuilder when using more than two partitions
 fixed potential issues in sorting algorithms
 fixed type issues with *_bytesize functions in C interface
 fixed bug in PostScript visualisation of matrices if matrix norm is zero
 fixed issues with GCC 4.3
 fixed bug in command line parsing of configuration system
 minor modifications to SCons system to increase user-friendliness
v0.13(.0) (released 2007-12-19)
 general Algorithmic Changes
 support for single precision arithmetic; has to be decided before compiling HLIBpro
 made complete C++ functions and classes visible from outside instead of just C interface functions
 rewrote complex arithmetic to distinguish between symmetric and hermitian matrices; added LDL^{H} and LL^{H} factorisations
 inversion now based on LU, thereby reducing memory consumption (roughly halved)
 added computation of the diagonal of the inverse without computing the inverse
 added evaluation of LU, LDL^{T} factorisations (instead of just solving)
 removed pointwise LU and LDL^{T} factorisation (only blocked) to improve robustness with zeroes on diagonal
 added (optional) check and fix for singular sub matrices during inversion and factorisation
 added complex valued HCA
 new version of ACA+
 multiplication C = ADB with diagonal D implemented
 implemented bilinear forms for Helmholtz single and double layer potential
 implemented bilinear form for acoustic scattering
 rewrote algebraic clustering for sparse matrices; added support for Scotch and CHACO
 added support for periodic coordinates in clustering
 added clustering with user defined index partition on first level in cluster tree
 added standard admissibility for algebraic clustering
 added maximal level in clustering to prevent infinite recursion
 modified solvers to handle complex valued data
 added permutation of dense matrices without temporary storage (needed in IO)
 parallel Arithmetic
 added thread parallel algorithms for matrix construction, matrix multiplication, inversion and LU factorisation
 redesigned thread pool, thereby fixing race conditions
 added support for Windows threads
 fixed several issues with thread safety
 Input and Output
 added general I/O functions with autodetection of file format
 added output of matrices in Harwell/Boeing format
 added MatrixMarket format
 added support for Ply and surface mesh format (NetGen) for Grid I/O
 fixed format errors in SAMG output
 conversion of arbitrary matrices to sparse format when writing in SAMG or Harwell/Boeing format
 fixed support for symmetric matrices in Harwell/Boeing format
 C interface
 prefixed all functions, types and constants with hlib_ (or HLIB_) to prevent collisions with other definitions (OS or libraries)
 added support for C99 complex types (if available)
 added hlib_set_coarsening to activate/deactivate coarsening during matrix construction (default: on) and matrix arithmetic (default: off)
 added hlib_matrix_inv_diag to return diagonal of inverse
 added hlib_matrix_is_complex to test for real or complex valued matrices
 added hlib_set_nthreads to set number of threads
 added hlib_coord_t as special type for coordinates
 separated stop criterion and solver in solver interface
 Miscellaneous
 updated CPUflags and Rmalloc
 fixed optimisation issues (leading to infinite loops) in enclosed CLAPACK
v0.12 (released 2006-11-01)
 Algorithmic Changes
 added (blocked) LDL^{T} factorisation (now default for symmetric matrices)
 no longer need extra matrix in matrix inversion
 using ACAFull in HCA (instead of SVD)
 adaptively choosing quadrature and interpolation order in ACA and HCA
 rewrote matrix addition to support general cases, e.g. low-rank to blocked
 rewrote low-rank truncation handling
 support for METIS in algebraic clustering routines
 added basic support for "dense" sparse matrices, e.g. with highly coupled indices
 added SSE2 based HCA algorithm
 added infinity norm for vectors
 using norm of preconditioned residual for all solvers if preconditioner is present
 added MINRES iteration
 using ADM_AUTO as default admissibility
 finally removed all asserts and replaced by internal error checking
 Input and Output
 VRML97 support
 added Matlab compression (Matlab v7) and structs support
 support for Harwell-Boeing matrix format (read-only)
 modified PostScript output of blockwise SVD; now scaled w.r.t. 2-norm of matrix
 OS and Library support
 MS Windows support
 shared libraries for Linux and Windows
 changed configure system to better handle MS Windows environment
 added internal xerbla to handle LAPACK errors directly
 C interface
 automatic choice of matrix building in hlib_matrix_build_bem_grid
 introduced vector_t as type for vectors (no more C arrays)
 added Gauss and Sauter triangle quadrature rules
 added functions to access matrix and vector entries
 added copyto and copyto_eps functions
 added hlib_matrix_build_dense to build ℋ-matrix from dense matrix
 changed solver management
 Miscellaneous
 several improvements and bug fixes
 cleaned up error codes
 updated CPUflags and Rmalloc
v0.11 (released 2006-05-29)
 Arithmetic
 added ACAFull
 added HCA (hybrid cross approximation)
 complex valued ACA and SVD
 added copy with coarsening for ℋ-matrices
 added computation of spectral norm for the inverse of a matrix
 support for permutations in matrix-vector multiplication of sparse matrices
 added support for Laplace SLP/DLP and 3D triangle surface grids
 fixed issues with degenerated bounding boxes in geometrical clustering
 Input/Output
 support for PLTMG matrix format
 Miscellaneous
 replaced error handling with exceptions
 added modified CLAPACK as default implementation of LAPACK to HLIBpro
 integrated CPUFlags into configure system
 added function for fast reciprocal square root
v0.10 (released 2006-04-05)
 Arithmetic
 initial support for complex arithmetic
 support for symmetric matrices in arithmetic
 implemented block LU factorisation
 implemented LDL^{T} factorisation
 added Frobenius norm for sparse matrices
 support for CRS format in sparse matrices
 added Jacobi and SOR matrix types (for matrix-vector multiplication)
 implemented hierarchical domain decomposition with parallel arithmetics
 Parallel Algorithms
 thread-parallel Cholesky factorisation
 thread-parallel coarsening of ℋ-matrices
 fixed thread-parallel LU and inversion
 fixed deadlocks in thread pool
 added direct communication in BSP mode
 parallel addition of matrices and vectors via streams
 Input/Output
 support for Matlab and SAMG format
 Miscellaneous
 introduced C interface functions and types
 added configure system for Makefiles
 added progress meter support for arithmetic
 added internal RTTI system
 support for memory consumption query on HP-UX
 rewrote error handling
v0.9 (released 2004-11-30)
 first public version as PHI (Parallel H-matrix Implementation)
 merged BSP-parallel and thread-parallel versions of ℋ-matrix library