News
v2.9 (released 20201119)
 replaced HLIB::complex by std::complex
 removed old DAG interface
 modified pivot strategy of standard ACA (better, more robust convergence)
 using geqr2 instead of geqrf for QR factorization (slightly faster)
 fixes:
 replaced deprecated features of TBB
 TGeomGroupCTBuilder: fixed handling of offsets
 missing instantiation of BLAS::random for BLAS::Vector added
 wrong solving flags in TLDUInvMatrix
 inefficient dependency handling in DAG construction for TLR/TileH
 memory leak in recursive DAG construction fixed
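With the switch to std::complex in v2.9, complex values in user code are ordinary standard library types. The following is a minimal sketch using only the C++ standard library; the helper name conj_dot is hypothetical and not part of HLIBpro.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HLIBpro): conjugated dot product
// sum_i conj(x_i) * y_i computed with plain std::complex values.
std::complex< double >
conj_dot ( const std::vector< std::complex< double > > &  x,
           const std::vector< std::complex< double > > &  y )
{
    std::complex< double >  sum( 0.0, 0.0 );
    const std::size_t       n = ( x.size() < y.size() ? x.size() : y.size() );

    for ( std::size_t  i = 0; i < n; ++i )
        sum += std::conj( x[i] ) * y[i];

    return sum;
}
```

Standard functions such as std::conj, std::abs and std::real apply directly to such values.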
v2.8
v2.8.1 (released 20200327)
 fixed weak admissibility (was actually standard admissibility)
 fixed MBLR clustering (ordering only in one dimension led to extremely rectangular clusters)
 fixed bug in low-rank approximation (wrong type conversion)
 fixed non-SIMD implementation of TExpBF (still had additional factor)
 fixed TDenseCoeffFn in case given dense matrix had nonzero row/column offsets
 replaced tbb::mutex and tbb::atomic by std versions since marked obsolete in recent TBB versions
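The std counterparts of tbb::mutex and tbb::atomic mentioned for v2.8.1 are std::mutex and std::atomic. The sketch below shows their typical use with standard threads only; the names SharedStats and run_threads are made up for this illustration and do not come from HLIBpro.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Illustrative type: the counter is a std::atomic, the floating point sum
// is a plain double protected by a std::mutex.
struct SharedStats
{
    std::atomic< std::size_t >  count{ 0 };
    std::mutex                  mtx;
    double                      sum = 0.0;

    void add ( const double  value )
    {
        ++count;                                    // atomic increment, no lock needed

        std::lock_guard< std::mutex >  lock( mtx ); // guards the non-atomic sum
        sum += value;
    }
};

// Start nthreads threads, each adding the value 1.0 exactly nper times.
void run_threads ( SharedStats &      stats,
                   const std::size_t  nthreads,
                   const std::size_t  nper )
{
    std::vector< std::thread >  threads;

    for ( std::size_t  t = 0; t < nthreads; ++t )
        threads.emplace_back( [&stats, nper] ()
                              {
                                  for ( std::size_t  i = 0; i < nper; ++i )
                                      stats.add( 1.0 );
                              } );

    for ( auto &  th : threads )
        th.join();
}
```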
v2.8 (released 20191229)
 improved new DAG generation system (better speed and parallel scalability) and made it the default system (old version still available with CFG::dag_version=1)
 moved non-generic include files into hpro subdirectory for better separation from other libraries
 added various approximation routines for sums of matrices and operators, e.g., SVD, pairwise SVD, RandSVD, RandLR and ACA for operator sums
 expanded lazy accumulator arithmetic to move all updates to leaves only (evaluating all updates simultaneously; see CFG::Arith::lazy_eval and CFG::Arith::sum_approx)
 added TZeroMatBuilder and build_zero_mat to construct empty matrix for given block cluster tree, e.g., as preinitialized result of other H-matrix operation
 parallelization of various routines during clustering, e.g., sorting, etc. (may result in slightly different clustering with different number of CPU cores)
 bug fixes:
 permutation of dense matrix in TMatBuilder removed (inconsistent behaviour compared to if H-matrix is built)
 fixed update of aux. data in H-matrix in copy_nearfield
 fixed reading of old HLIB files (wrong processor sets)
 fixed return value of Mem::usage
 fixed various issues when using single precision
v2.7
v2.7.2 (released 20190703)
 fixed various compiler issues with MS Visual C++
 added build method for coefficient functions to return dense matrix for given index set
 C bindings:
 added hlib_matrix_to_dense/rank
 added hlib_matrix_approx_rank to compute low-rank approximation of given matrix with different methods
 fixed issue in TBSPNDCTBuilder when no interface is present
 fixed issue in HLIB::Mem::usage
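To illustrate what a coefficient function with a build method for a given index set does, here is a self-contained sketch. It is not the HLIBpro interface: the class ExpCovCoeff, its members and the row-major storage are assumptions chosen for this example. The kernel is the exponential covariance exp(-|x_i - x_j| / length), which is the Matern kernel with smoothness 1/2.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical coefficient function (not the HLIBpro interface):
// entry (i,j) is exp( -|x_i - x_j| / length ) for 1D points x.
class ExpCovCoeff
{
public:
    ExpCovCoeff ( std::vector< double >  points,
                  const double           length )
            : _points( std::move( points ) )
            , _length( length )
    {}

    // single matrix coefficient
    double eval ( const std::size_t  i, const std::size_t  j ) const
    {
        return std::exp( - std::abs( _points[i] - _points[j] ) / _length );
    }

    // dense block for the given row and column index sets (row-major)
    std::vector< double >
    build ( const std::vector< std::size_t > &  rows,
            const std::vector< std::size_t > &  cols ) const
    {
        std::vector< double >  block( rows.size() * cols.size() );

        for ( std::size_t  r = 0; r < rows.size(); ++r )
            for ( std::size_t  c = 0; c < cols.size(); ++c )
                block[ r * cols.size() + c ] = eval( rows[r], cols[c] );

        return block;
    }

private:
    std::vector< double >  _points;
    double                 _length;
};
```

Only the requested entries are evaluated, so a block can be assembled without forming the full matrix.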
v2.7.1 (released 20190218)
 fixed issue with HDF5 library but removed support from binary distribution due to linking problems with newer version of libHDF5
 added missing functions to sequential NET interface
 added functions to directly set LR matrices in TRkMatrix
 using internal grid generation also in example code (laplace/helmholtz)
 additional spherical grid (different start grid for "inbetween" steps)
 improved coordinate visualization in PostScript format (better minimal distance estimate)
v2.7 (released 20181128)
 new DAG generation based on recursive algorithms with automatic deduction of dependencies between nodes (default: previous DAG; see CFG::dag_version)
 new coefficient function for Matern kernel (TMaternCovCoeffFn) and exponential bilinear form (TExpBF)
 functions for computing low-rank approximations for sum of matrices directly (using pairwise SVD approx_sum_svd, randomized SVD approx_sum_randsvd or randomized low-rank approximation approx_sum_randlr)
 ACA:
 modified stop criterion for ACA (user controllable maximal rank CFG::Build::aca_max_ratio)
 added dense fallback for ACA if not converging, computing only those coefficients not yet computed
 added MBLR cluster tree construction (TMBLRCTBuilder)
 modified handling of matrix coefficient functions, especially TPermCoeffFn
 (limited) grid generation and refinement in BEM library
 using libmvec for sin, cos and exp if available (glibc v2.22 and up) with significant speedups in complex valued computations
 new academic license without any user/host or date limitation
v2.6 (released 20171013)
 Accumulator based H-arithmetic reducing number of truncations with support for lazy and eager evaluation
 added randomized SVD and implemented dense approximation and low-rank truncation for all types (SVD, RRQR and RandSVD)
 also added low-rank approximation algorithms for RRQR and RandSVD for H-construction (TRRQRLRApx, TRandSVDLRApx)
 support for special flat H-hierarchy with optimised arithmetic functions, e.g., inplace inversion.
 support for block refinement during matrix construction, e.g., if admissibility gives false positives
 added infinity matrix norm (TInfinityNorm and norm_inf())
 implemented TOffDiagAdmCond with all offdiagonal blocks being admissible
 massive code restructuring and cleanup
 initial support for HDF5 matrix IO (dense and lowrank)
 support for VSX instruction set (POWER CPUs)
 special handling for all BLAS functions in case of parallel Intel MKL
 added parameter configuration with config files
 added some functions to simplify solver stopping criterion
 changed behaviour/incompatibilities with previous versions:
 removed all permutation handling from THMatrix and TNearfieldMulVec
 no recompression in ACA/HCA (now only in matrix builders)
 ptrcast() now consistent with cptrcast(), i.e., no * needed
 some parameter reorganization
 previous TMatrix::copy_struct is renamed to TMatrix::copy_struct_from (TMatrix::copy_struct will now return a matrix copy without data)
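For reference, the infinity matrix norm added in v2.6 is the maximal absolute row sum. The sketch below computes it for a small dense matrix; dense_norm_inf is a hypothetical helper for illustration and says nothing about the signature of HLIBpro's norm_inf().

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Infinity norm (maximal absolute row sum) of a dense nrows x ncols matrix
// in row-major storage. Hypothetical helper, not HLIBpro's norm_inf().
double
dense_norm_inf ( const std::vector< double > &  A,
                 const std::size_t              nrows,
                 const std::size_t              ncols )
{
    double  norm = 0.0;

    for ( std::size_t  i = 0; i < nrows; ++i )
    {
        double  row_sum = 0.0;

        for ( std::size_t  j = 0; j < ncols; ++j )
            row_sum += std::abs( A[ i * ncols + j ] );

        norm = std::max( norm, row_sum );
    }

    return norm;
}
```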
v2.5
v2.5.1 (released 20170203)
 added operator for matrix sum (TMatrixSum) in addition to matrix products
 missing bindings in C interface (matrix product/sum, apply, apply_add)
 fixed issue in clustering with predefined partition with many groups
 support for more matrix formats and more robust IO (if files do not follow standard) for Harwell-Boeing/MatrixMarket format
 reenabled parallel block cluster tree construction on shared memory
 bug fixes:
 in solve_diag_left_block
 matrix solves in TLLInvMatrix
v2.5 (released 20160902)
 implemented rank revealing QR based lowrank truncation
 Solvers:
 added CGS and TFQMR solvers
 added support for matrix solves in linear iteration (also ℋ-matrices!)
 optional computation of exact residual during iteration
 simplified handling of stop criterion parameters
 using status field in TSolverInfo instead of exception if solver fails (e.g., breakdown)
 some code restructuring
 support for blockwise Jacobi and Gauss-Seidel operators
 support for AVX512 instruction set
 many new user controllable parameters
 misc.:
 handling of diagonal in factorisation (inverse or normal) now a runtime option (default: inverse)
 added optional distance for TWeakAlgAdmCond to support distances other than one
 support for fixed rank 0
 correct progress bar support for WAZ factorisation and inversion
 some reorganization of source/header files
 C bindings:
 added hlib_admcond_geom_hilo for THiLoFreqGeomAdmCond
 additional parameter for blockdiag functions (blocksize)
 fixed serious issue with Intel TBB and with current Intel TBB-based Intel MKL
 various bug fixes
v2.4 (released 20151028)
 Added factorization of inverse matrix WAZ = I, enabling vector solves using matrix vector mult. instead of forward/backward solves with much better parallel speedup.
 Significant improvements in parallel performance of matrix inversion.
 Improved performance of LU, matrix-vector mult., forward/backward solves.
 Added function nearfield_sparse to extract 𝓗-matrix nearfield as sparse matrix.
 Switched to adaptive_split_axis as default for clustering.
 Minor Changes:
 Additional options for matrix visualization (colormap, etc.)
 Basic VTK output of block clusters.
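The idea behind the WAZ = I factorisation in v2.4 is that A^{-1} = Z W, so solving A x = b needs two matrix-vector products instead of forward/backward substitution. A tiny dense sketch of that identity follows; the 2x2 types and function names are assumptions for illustration, not HLIBpro code.

```cpp
#include <array>

using Vec2 = std::array< double, 2 >;
using Mat2 = std::array< std::array< double, 2 >, 2 >;

// y = M x for a 2x2 matrix
Vec2 mul_vec ( const Mat2 &  M, const Vec2 &  x )
{
    return { M[0][0] * x[0] + M[0][1] * x[1],
             M[1][0] * x[0] + M[1][1] * x[1] };
}

// Solve A x = b with factors satisfying W A Z = I, i.e. A^{-1} = Z W:
// x = Z ( W b ), two matrix-vector products and no triangular solves.
Vec2 solve_waz ( const Mat2 &  W, const Mat2 &  Z, const Vec2 &  b )
{
    return mul_vec( Z, mul_vec( W, b ) );
}
```

For A = [[4,2],[2,3]] one valid choice is W = [[1,0],[-0.5,1]] and Z = [[0.25,-0.25],[0,0.5]], the inverses of the LU factors of A. Both products are independent row-wise operations, which is what makes this form attractive for parallel execution.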
v2.3
v2.3.2 (released 20150623)
 fixed various bugs and race conditions
 extended ctors of various matrix classes to accept optional value type field
 added example on how to assemble block matrices
v2.3.1 (released 20141112)
 Fixed two bugs in pointwise LU.
 Solver changes:
 Refactored solver classes (no interface changes); added TRichardson to replace TSolver in the future.
 Fixed inconsistent computation of residual norm in solvers. Now Richardson, CG and BiCG will compute standard residual norm, while MINRES and GMRES compute preconditioned residual norm.
 Made initialisation of start vector in solver classes optional (function initialise_start_value)
 Added function diagonal to extract diagonal of a matrix.
 Added example spectrum to compute spectrum of graph Laplacian (see also documentation).
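The distinction between standard and preconditioned residual norm in the v2.3.1 solver changes matters when comparing stopping tolerances across solvers. The sketch below computes both norms for a dense matrix with a Jacobi preconditioner W = diag(A)^{-1}; the struct and function names are hypothetical and the preconditioner choice is an assumption for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct ResidualNorms
{
    double  standard;        // || b - A x ||_2
    double  preconditioned;  // || W ( b - A x ) ||_2 with W = diag(A)^{-1}
};

// Both residual norms for a dense row-major n x n matrix A (n = b.size()).
ResidualNorms
residual_norms ( const std::vector< double > &  A,
                 const std::vector< double > &  x,
                 const std::vector< double > &  b )
{
    const std::size_t  n       = b.size();
    double             std_sq  = 0.0;
    double             prec_sq = 0.0;

    for ( std::size_t  i = 0; i < n; ++i )
    {
        double  r = b[i];                       // r_i = b_i - (A x)_i

        for ( std::size_t  j = 0; j < n; ++j )
            r -= A[ i * n + j ] * x[j];

        const double  pr = r / A[ i * n + i ];  // apply Jacobi preconditioner

        std_sq  += r * r;
        prec_sq += pr * pr;
    }

    return { std::sqrt( std_sq ), std::sqrt( prec_sq ) };
}
```

For A = diag(2,4), x = 0 and b = (2,4) the standard norm is sqrt(20) while the preconditioned norm is sqrt(2), so the same tolerance stops the two kinds of solver at different points.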
v2.3 (released 20141027)
 Fixed issues when solving dense matrices (used in new example for many RHSs).
 Modified THiLoFreqGeomAdmCond: now maximal number of wavelengths per cluster is tested.
 Refactored geometrical clustering classes and partitioning strategies, thereby fixing several issues.
 C++11 changes:
 most object creating functions now return std::unique_ptr,
 replaced typedef by using,
 added iterators for TIndexSet, TNodeSet, TGraph, TProcSet (for range based for).
 Added parameter to algebraic clustering in C bindings to define partitioning algorithm (BFS, multi level, METIS or Scotch).
 Fixed issues with progress bar during factorisation (wrong block count).
 Removed BSP style communication functions (MPI only now).
 Finished conversion to new packed_t SIMD type. Using SSE3 instead of SSE2.
 Added lock to TScotchAlgPartStrat because Scotch is not multi thread safe.
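The three C++11 changes listed for v2.3 (functions returning std::unique_ptr, using instead of typedef, iterators for range based for) can be illustrated together. The class IndexRange below is a hypothetical stand-in written for this example; it is not TIndexSet and makes no claim about the HLIBpro interface.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for an index set: contiguous range [first,last]
// with the minimal iterator interface needed by range-based for.
class IndexRange
{
public:
    using value_t = std::size_t;   // alias declaration instead of typedef

    class iterator
    {
    public:
        explicit iterator ( const value_t  pos ) : _pos( pos ) {}

        value_t     operator *  () const { return _pos; }
        iterator &  operator ++ ()       { ++_pos; return *this; }
        bool        operator != ( const iterator &  other ) const { return _pos != other._pos; }

    private:
        value_t  _pos;
    };

    IndexRange ( const value_t  first, const value_t  last )
            : _first( first ), _last( last )
    {}

    iterator begin () const { return iterator( _first ); }
    iterator end   () const { return iterator( _last + 1 ); }

private:
    value_t  _first, _last;
};

// Object creating function returning std::unique_ptr: ownership is explicit
// and the object is released automatically (plain new, since C++11 has no
// std::make_unique).
std::unique_ptr< IndexRange >
make_range ( const std::size_t  first, const std::size_t  last )
{
    return std::unique_ptr< IndexRange >( new IndexRange( first, last ) );
}

// Sum of all indices, using range-based for over the iterators above.
std::size_t sum_indices ( const IndexRange &  is )
{
    std::size_t  sum = 0;

    for ( auto  i : is )
        sum += i;

    return sum;
}
```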
v2.2 (released 20140715)

 Removing implicit reordering of unknowns during matrix-vector multiplication to fix inconsistent behaviour. Please use permutations from cluster trees or ℋ-matrices to reorder vectors or TPermMatrix to represent permuted matrices instead.
 Speedup improvements for matrix inversion. Triangular inversion and matrix multiplication available in standard user interface.
 Import/export from/to CCS/CRS matrices simplified.
 Simplified (and faster) mutex wrapper.
 Several C++11 changes.
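Since v2.2 vectors have to be reordered explicitly before and after a matrix-vector multiplication. What such a reordering amounts to is sketched below with plain standard containers; the convention that perm[i] is the new position of entry i is an assumption of this example, and the function names are hypothetical rather than HLIBpro functions.

```cpp
#include <cstddef>
#include <vector>

// Reorder x: entry i moves to position perm[i]
// (convention assumed for this sketch only).
std::vector< double >
permute ( const std::vector< std::size_t > &  perm,
          const std::vector< double > &       x )
{
    std::vector< double >  y( x.size() );

    for ( std::size_t  i = 0; i < x.size(); ++i )
        y[ perm[i] ] = x[i];

    return y;
}

// Inverse reordering: entry at position perm[i] moves back to i.
std::vector< double >
permute_back ( const std::vector< std::size_t > &  perm,
               const std::vector< double > &       y )
{
    std::vector< double >  x( y.size() );

    for ( std::size_t  i = 0; i < y.size(); ++i )
        x[i] = y[ perm[i] ];

    return x;
}
```

A typical pattern is to permute the input vector into the internal ordering, multiply, and permute the result back with the inverse mapping.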
v2.1 (released 20140508)
 Removing reference counters in BLAS interface due to major performance issue on multicore (socket) systems. See documentation on how to use the modified interface (and avoid errors).
 New, scalable matrix-vector multiplication implemented.
 Using generic datatype for SIMD instructions, thereby enabling generic SIMD algorithms, e.g. for BEM kernels, and fast adoption of new SIMD instructions, e.g. AVX2.
 Removed TVirtualVector (replaced by TScalarVector).
 and, as usual: several bugs fixed
v2.0
v2.0.2 (released 20140124)
 fixed race condition in C bindings
 fixed issue with initialisation of static variables
v2.0.1 (released 20131205)
 fixed some bugs
v2.0 (released 20130918)

Major Changes
 Switched from OpenMP to Threading Building Blocks as interface to shared memory parallelism, thereby also changing most algorithms to task-based parallelism.
 Reducing dependency on external libraries by using C++11 features. Also replacing some classes by default C++ versions (finally removing old code).
 Alternative, non-recursive, level-wise ℋ-LU factorisation based on explicit block dependencies, which provides far better speedup on many-core systems, e.g. Intel MIC architecture.
 New ℋ-LU factorisation algorithm also applicable in distributed environments, yielding better load balancing (albeit with limited speedup).
 Added support for multiple CPUs to many algorithms, e.g. in clustering, norm computations, matrix-vector multiplication and solves, ℋ²-conversion.

Minor Changes
 Optimised BEM kernels for Intel MIC architecture.
 Introduced TLinearOperator for operators not supporting TMatrix functionality, e.g. factorised matrices.
 HLIBpro file format changed due to internal changes and due to some bugs in the format. However, backward read compatibility for most files written with earlier versions is kept.
 Added Support for Cairo library, thereby providing PDF output.
 And of course: many smaller feature upgrades and bug fixes.
v1.2 (released 20120223)

Matrix Construction:

 Switched to template based coefficient functions (TCoeffFn and derived) and all dependent classes, e.g. TDenseMBuilder, SVD and ACA low rank approximation.
 Rewrote HCA:
 Simpler interface containing all necessary functionality in single class.
 Using template for value type.
 Added base classes for permuted indices and for BEM applications using quadrature.
 Added implementation for Laplace and Helmholtz also for linear ansatz spaces and with support for SSE2 and AVX.
 Cleaned up ACA implementation.
 Changed handling of recompression: should now be handled by default for low rank approximation algorithm and not by matrix construction class (to avoid recompression of optimal results).


Clustering Changes:

 Added TNDBSPPartStrat to be used in connection with nested dissection (trying various clusterings and choosing best for ND).
 Modified TNDBSPCTBuilder to more resemble algebraic version, e.g. average depth for interface clusters instead of maximal.
 Fixed bug in PCA based clustering and added version for cardinality based clustering.
 Added various flags to modify clustering, e.g. synchronisation of interface depth, enforcing block clusters with same depth of corresponding clusters, using symmetrised weights in algebraic clustering.


Input/Output and visualisation:
 Fixed bug in reading dense matrices.
 Changed order of dimension for coordinate IO using Matlab format: now ncoord × dimension (e.g. as also used by Sparse Matrix Collection).
 Added VTK visualisation for coordinates (with various options, e.g. marking clusters or index connectivity) and BEM grids.
 Added Output of Grids in HLIB format.
 Added coordinate IO in MatrixMarket format.

Changes in LAPACK wrapper:
 added LAPACK workspace queries for optimal workspace size instead of using predefined block size
 using xGESDD for large matrices
 various bug fixes.
v1.1
v1.1.1 (released 20111129)
 Deactivated default coarsening during matrix construction.
 Added special H² matrix builder with predefined cluster bases.
v1.1 (released 20111128)
 changes in BEM code:
 Added support for AVX.
 Performance speedups in SSE2 implementation of Helmholtz and Maxwell kernels.
 Runtime detection of SSE2/AVX availability and automatic choice of optimal kernel.
 Added matrix_format function to matrix coefficient functions to define whether unsymmetric, symmetric or hermitian (default: unsymmetric). Default build function in matrix builders now without format argument.
 Added support for ILP64 BLAS/LAPACK implementations (64bit integers).
 Added support for AMDLibM (integrated in binary Linux distributions).
 Added vector IO in MatrixMarket format.
 Cleaned up C++ examples (thereby also removing Boost link dependency).
 several bug fixes
v1.0
v1.0.1 (released 20110928)
 OpenMP exception handling changed: now all threads will stop as soon as possible in case of an error
 fixed several, previously undetected, noncritical compiler warnings (MS Visual C++)
 bug fixes
v1.0 (released 20110701)
HLIBpro v1.0 is a major rewrite/reorganisation of many of the 𝓗-matrix algorithms.
The following list of changes only covers the main topics and is far from complete.
 added distributed computing via MPI for matrix construction and factorisation
 added 𝓗²matrices
 added internal multilevel graph partitioning for blackbox clustering
 added support for piecewise linear basis functions and Maxwell EFIE/MFIE
 rewrote interface to BLAS/LAPACK
 rewrote C interface with better mapping of internal C++ and C types
 increased robustness of matrix factorisation in case of badly conditioned matrices
 increased speedup of matrix factorisation in multithreaded computations
 many performance improvements and bug fixes
v0.13
v0.13.6 (released 20090130)
 added optional diagonal scaling of 𝓗matrices during LU factorisation
 added blockwise accuracy, e.g. accuracy depending on current matrix block
 rewrote accuracy handling in C bindings
 simplified BSP partitioning methods and added regular cardinality based and principal component based clustering
 added optional balancing of tree depth in cluster tree construction with predefined partitioning
 implemented optional double precision computation of matrix inversion and lowrank truncation in single precision mode
 fixed bug in calling single precision norm functions of LAPACK
 fixed bug in PostScript output and modified 𝓗matrix output in PostScript format
 added support for Jacobi based SVD (sgejsv and dgejsv) in LAPACK v3.2
v0.13.5 (released 20080923)
 removed ID based cluster tree computations in matrices
 always computing SCC in algebraic clustering, also in nested dissection clustering
 reordering clusters depending on size ratio (large first)
 fixed bug with filenames without directories
 fixed nonexception safe OpenMP usage
 added matrix reduction to nearfield part
 added dense lowrank multiplication if result is large dense matrix
v0.13.4 (released 20080424)
 fixed solve functions in TLU, TLDL (checking for NULL blocks)
 fixed OpenMP call with zero threads in TLU, TLDL and TMatrixInv
 fixed operator = in autoptr (wrong const)
 removed unnecessary checks in TArray::copy
 fixed recursive call in restrict_blockdiag
 replaced fixed constants by type dependent constants in lapack.cc
 fixed TMatrixInv::multiply_diag when only D is dense
v0.13.3 (released 20080327)
 fixed several warnings from Visual C++ and Intel C++ compilers
 moved all global variables and functions into HLIB namespace (except xerbla override)
 enabled user defined prefix for functions and types in C interface and added override for namespace name
 reactivated cardinality check when using HLIB_BSP_AUTO
v0.13.2 (released 20080229)
 replaced threads and mutexes by OpenMP (thread start only, no scheduling)
 included log file support in addition to stdout
 added parallel LDL^{T} factorisation (DD and blockdiag only)
 added parallel blockdiag LU factorisation
 added zero approximation during matrix construction (for nearfield only)
 fixed bug in algebraic nested dissection clustering (wrong path length in interface)
v0.13.1 (released 20080204)
 reduced memory consumption/fragmentation in ACA generated matrices with large rank
 added Fiduccia/Mattheyses bisection optimisation for BFS clustering
 added FFT for vectors by implementing support for FFTW3 (optional)
 fixed bug in TBSPPartCTBuilder when using more than two partitions
 fixed potential issues in sorting algorithms
 fixed type issues with *_bytesize functions in C interface
 fixed bug in PostScript visualisation of matrices if matrix norm is zero
 fixed issues with GCC4.3
 fixed bug in command line parsing of configuration system
 minor modifications to SCons system to increase userfriendliness
v0.13(.0) (released 20071219)
 general Algorithmic Changes
 support for single precision arithmetic; has to be decided before compiling HLIBpro
 made complete C++ functions and classes visible from outside instead of just C interface functions
 rewrote complex arithmetic to distinguish between symmetric and hermitian matrices; added LDL^{H} and LL^{H} factorisations
 inversion now based on LU, thereby reducing memory consumption (roughly halved)
 added computation of the diagonal of the inverse without computing the inverse
 added evaluation of LU, LDL^{T} factorisations (instead of just solving)
 removed pointwise LU and LDL^{T} factorisation (only blocked) to improve robustness with zeroes on diagonal
 added (optional) check and fix for singular sub matrices during inversion and factorisation
 added complex valued HCA
 new version of ACA+
 multiplication C = ADB with diagonal D implemented
 implemented bilinear forms for Helmholtz single and double layer potential
 implemented bilinear form for acoustic scattering
 rewrote algebraic clustering for sparse matrices; added support for Scotch and CHACO
 added support for periodic coordinates in clustering
 added clustering with user defined index partition on first level in cluster tree
 added standard admissibility for algebraic clustering
 added maximal level in clustering to prevent infinite recursion
 modified solvers to handle complex valued data
 added permutation of dense matrices without temporary storage (needed in IO)
 parallel Arithmetic
 added thread parallel algorithms for matrix construction, matrix multiplication, inversion and LU factorisation
 redesigned thread pool, thereby fixing race conditions
 added support for Windows threads
 fixed several issues with thread safety
 Input and Output
 added general I/O functions with autodetection of file format
 added output of matrices in Harwell/Boeing format
 added MatrixMarket format
 added support for Ply and surface mesh format (NetGen) for Grid I/O
 fixed format errors in SAMG output
 conversion of arbitrary matrices to sparse format when writing in SAMG or Harwell/Boeing format
 fixed support for symmetric matrices in Harwell/Boeing format
 C interface
 prefixed all functions, types and constants with hlib_ (or HLIB_) to prevent collisions with other definitions (OS or libraries)
 added support for C99 complex types (if available)
 added hlib_set_coarsening to activate/deactivate coarsening during matrix construction (default: on) and matrix arithmetic (default: off)
 added hlib_matrix_inv_diag to return diagonal of inverse
 added hlib_matrix_is_complex to test for real or complex valued matrices
 added hlib_set_nthreads to set number of threads
 added hlib_coord_t as special type for coordinates
 separated stop criterion and solver in solver interface
 Miscellaneous
 updated CPUflags and Rmalloc
 fixed optimisation issues (leading to infinite loops) in enclosed CLAPACK
v0.12 (released 20061101)
 Algorithmic Changes
 added (blocked) LDL^{T} factorisation (now default for symmetric matrices)
 no longer need extra matrix in matrix inversion
 using ACAFull in HCA (instead of SVD)
 adaptively choosing quadrature and interpolation order in ACA and HCA
 rewrote matrix addition to support general cases, e.g. lowrank to blocked
 rewrote lowrank truncation handling
 support for METIS in algebraic clustering routines
 added basic support for "dense" sparse matrices, e.g. with highly coupled indices
 added SSE2 based HCA algorithm
 added infinity norm for vectors
 using norm of preconditioned residual for all solvers if preconditioner is present
 added MINRES iteration
 using ADM_AUTO as default admissibility
 finally removed all asserts and replaced by internal error checking
 Input and Output
 VRML97 support
 added Matlab compression (Matlab v7) and structs support
 support for HarwellBoeing matrix format (readonly)
 modified PostScript output of blockwise SVD; now scaled w.r.t. 2norm of matrix
 OS and Library support
 MS Windows support
 shared libraries for Linux and Windows
 changed configure system to better handle MS Windows environment
 added internal xerbla to handle LAPACK errors directly
 C interface
 automatic choice of matrix building in hlib_matrix_build_bem_grid
 introduced vector_t as type to vectors (no more C arrays)
 added Gauss and Sauter triangle quadrature rules
 added functions to access matrix and vector entries
 added copyto and copyto_eps functions
 added hlib_matrix_build_dense to build 𝓗-matrix from dense matrix
 changed solver management
 Miscellaneous
 several improvements and bug fixes
 cleaned up error codes
 updated CPUflags and Rmalloc
v0.11 (released 20060529)
 Arithmetic
 added ACAFull
 added HCA (hybrid cross approximation)
 complex valued ACA and SVD
 added copy with coarsening for 𝓗matrices
 added computation of spectral norm for the inverse of a matrix
 support for permutations in matrixvector multiplication of sparse matrices
 added support for Laplace SLP/DLP and 3D triangle surface grids
 fixed issues with degenerated bounding boxes in geometrical clustering
 Input/Output
 support for PLTMG matrix format
 Miscellaneous
 replaced error handling with exceptions
 added modified CLAPACK as default implementation of LAPACK to HLIBpro
 integrated CPUFlags into configure system
 added function for fast reciprocal square root
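As background for the ACAFull entry in v0.11: adaptive cross approximation with full pivoting builds a low-rank approximation from successive rank-1 updates, each time taking the largest entry of the residual as pivot. The following is a textbook sketch on a dense matrix, not the HLIBpro implementation; all names, the explicit residual storage and the stop criterion (largest residual entry relative to the first pivot) are assumptions made for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Low-rank factors: A ~ sum_k u[k] * v[k]^T
struct LowRank
{
    std::vector< std::vector< double > >  u, v;
};

// Textbook ACA with full pivoting for a dense row-major n x m matrix.
// R starts as a copy of A and holds the residual after each rank-1 update.
LowRank
aca_full ( std::vector< double >  R,
           const std::size_t      n,
           const std::size_t      m,
           const double           eps,
           const std::size_t      max_rank )
{
    LowRank  lr;
    double   first_pivot = 0.0;

    for ( std::size_t  k = 0; k < max_rank; ++k )
    {
        // full pivot search: residual entry of maximal modulus
        std::size_t  pi = 0, pj = 0;
        double       pmax = 0.0;

        for ( std::size_t  i = 0; i < n; ++i )
            for ( std::size_t  j = 0; j < m; ++j )
                if ( std::abs( R[ i*m + j ] ) > pmax )
                {
                    pmax = std::abs( R[ i*m + j ] );
                    pi   = i;
                    pj   = j;
                }

        if ( k == 0 )
            first_pivot = pmax;

        // stop if residual is zero or small relative to the first pivot
        if (( pmax == 0.0 ) || ( pmax <= eps * first_pivot ))
            break;

        // u = pivot column scaled by 1/pivot, v = pivot row
        const double           pivot = R[ pi*m + pj ];
        std::vector< double >  u( n ), v( m );

        for ( std::size_t  i = 0; i < n; ++i ) u[i] = R[ i*m + pj ] / pivot;
        for ( std::size_t  j = 0; j < m; ++j ) v[j] = R[ pi*m + j ];

        // rank-1 update of the residual
        for ( std::size_t  i = 0; i < n; ++i )
            for ( std::size_t  j = 0; j < m; ++j )
                R[ i*m + j ] -= u[i] * v[j];

        lr.u.push_back( std::move( u ) );
        lr.v.push_back( std::move( v ) );
    }

    return lr;
}

// Largest entrywise difference between A and its low-rank approximation.
double
max_error ( const std::vector< double > &  A,
            const LowRank &                lr,
            const std::size_t              n,
            const std::size_t              m )
{
    double  err = 0.0;

    for ( std::size_t  i = 0; i < n; ++i )
        for ( std::size_t  j = 0; j < m; ++j )
        {
            double  s = 0.0;

            for ( std::size_t  k = 0; k < lr.u.size(); ++k )
                s += lr.u[k][i] * lr.v[k][j];

            err = std::max( err, std::abs( A[ i*m + j ] - s ) );
        }

    return err;
}
```

Full pivoting needs every matrix entry, which is why it is mainly useful for small blocks or as a reference; partially pivoted variants only evaluate selected rows and columns.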
v0.10 (released 20060405)
 Arithmetic
 initial support for complex arithmetic
 support for symmetric matrices in arithmetic
 implemented block LU factorisation
 implemented LDL^{T} factorisation
 added Frobenius norm for sparse matrices
 support for CRS format in sparse matrices
 added Jacobi and SOR matrix types (for matrixvector multiplication)
 implemented hierarchical domain decomposition with parallel arithmetics
 Parallel Algorithms
 threadparallel Cholesky factorisation
 threadparallel coarsening of 𝓗matrices
 fixed threadparallel LU and inversion
 fixed deadlocks in threadpool
 added direct communication in BSP mode
 parallel addition of matrices and vectors via streams
 Input/Output
 support for Matlab and SAMG format
 Miscellaneous
 introduced C interface functions and types
 added configure system for Makefiles
 added progress meter support for arithmetic
 added internal RTTI system
 support for memory consumption query on HPUX
 rewrote error handling
v0.9 (released 20041130)
 first public version as PHI (Parallel H-matrix Implementation)
 merged BSP-parallel and thread-parallel versions of 𝓗-matrix library