News

v2.7 (released 2018-11-28)

  • new DAG generation based on recursive algorithms with automatic deduction of dependencies between nodes (default: previous DAG; see CFG::dag_version)
  • new coefficient function for Matern kernel (TMaternCovCoeffFn) and exponential bilinear form (TExpBF)
  • functions for directly computing low-rank approximations of a sum of matrices (using pair-wise SVD: approx_sum_svd, randomized SVD: approx_sum_randsvd, or randomized low-rank approximation: approx_sum_randlr)
  • ACA:
    • modified stop criterion for ACA (user controllable maximal rank CFG::Build::aca_max_ratio)
    • added dense fallback for ACA if it does not converge, computing only those coefficients not yet computed
  • added MBLR cluster tree construction (TMBLRCTBuilder)
  • modified handling of matrix coefficient functions, especially TPermCoeffFn
  • (limited) grid generation and refinement in BEM library
  • using libmvec for sin, cos and exp if available (glibc v2.22 and up) with significant speedups in complex valued computations
  • new academic license without any user/host or date limitation

v2.6 (released 2017-10-13)

  • Accumulator based H-arithmetic reducing number of truncations with support for lazy and eager evaluation
  • added randomized SVD and implemented dense approximation and lowrank truncation for all types (SVD, RRQR and Rand-SVD)
  • also added lowrank approximation algorithms for RRQR and Rand-SVD for H-construction (TRRQRLRApx, TRandSVDLRApx)
  • support for special flat H-hierarchy with optimised arithmetic functions, e.g., in-place inversion.
  • support for block refinement during matrix construction, e.g., if admissibility gives false positives
  • added infinity matrix norm (TInfinityNorm and norm_inf())
  • implemented TOffDiagAdmCond with all off-diagonal blocks being admissible
  • massive code restructuring and cleanup
  • initial support for HDF5 matrix IO (dense and lowrank)
  • support for VSX instruction set (POWER CPUs)
  • special handling for all BLAS functions in case of parallel Intel MKL
  • added parameter configuration with config files
  • added some functions to simplify solver stopping criterion
  • changed behaviour/incompatibilities with previous versions:
    • removed all permutation handling from THMatrix and TNearfieldMulVec
    • no recompression in ACA/HCA (now only in matrix builders)
    • ptrcast() now consistent with cptrcast(), i.e., no * needed
    • some parameter reorganization
    • previous TMatrix::copy_struct is renamed to TMatrix::copy_struct_from (TMatrix::copy_struct will now return a matrix copy without data)

v2.5

v2.5.1 (released 2017-02-03)

  • added operator for matrix sum (TMatrixSum) in addition to matrix products
  • missing bindings in C interface (matrix product/sum, apply, apply_add)
  • fixed issue in clustering with predefined partition with many groups
  • support for more matrix formats and more robust IO (if files do not follow standard) for Harwell-Boeing/Matrix-Market format
  • re-enabled parallel block cluster tree construction on shared memory
  • bug fixes:
    • in solve_diag_left_block
    • matrix solves in TLLInvMatrix

v2.5 (released 2016-09-02)

  • implemented rank revealing QR based low-rank truncation
  • Solvers:
    • added CGS and TFQMR solvers
    • added support for matrix solves in linear iteration (also 𝓗-matrices!)
    • optional computation of exact residual during iteration
    • simplified handling of stop criterion parameters
    • using status field in TSolverInfo instead of exception if solver fails (e.g., breakdown)
    • some code restructuring
  • support for block-wise Jacobi and Gauss-Seidel operators
  • support for AVX512 instruction set
  • many new user controllable parameters
  • misc.:
    • handling of diagonal in factorisation (inverse or normal) now a runtime option (default: inverse)
    • added optional distance for TWeakAlgAdmCond to support distances other than one
    • support for fixed rank 0
    • correct progress bar support for WAZ factorisation and inversion
    • some reorganization of source/header files
  • C bindings:
    • added hlib_admcond_geom_hilo for THiLoFreqGeomAdmCond
    • additional parameter for blockdiag functions (blocksize)
  • fixed serious issue with Intel TBB and with current Intel TBB-based Intel MKL
  • various bug fixes

v2.4 (released 2015-10-28)

  • Added factorization of inverse matrix WAZ = I, enabling vector solves using matrix vector mult. instead of forward/backward solves with much better parallel speedup.
  • Significant improvements in parallel performance of matrix inversion.
  • Improved performance of LU, matrix-vector mult., forward/backward solves.
  • Added function nearfield_sparse to extract 𝓗-matrix nearfield as sparse matrix.
  • Switched to adaptive_split_axis as default for clustering.
  • Minor Changes:
    • Additional options for matrix visualization (colormap, etc.)
    • Basic VTK output of block clusters.
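
A short note on why the WAZ factorisation replaces forward/backward solves by matrix-vector multiplications. This is only a sketch, assuming A is invertible and W, Z are the computed factors; the symbols x and b for the solution and right-hand side are introduced here for illustration:

```latex
W A Z = I
\;\Longrightarrow\; A = W^{-1} Z^{-1}
\;\Longrightarrow\; A^{-1} = Z W,
\qquad\text{hence}\qquad
A x = b \;\Longrightarrow\; x = Z \, (W b).
```

Both steps on the right are plain multiplications, which is what allows the better parallel speedup mentioned above.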

v2.3

v2.3.2 (released 2015-06-23)

  • fixed various bugs and race conditions
  • extended ctors of various matrix classes to accept optional value type field
  • added example on how to assemble block matrices

v2.3.1 (released 2014-11-12)

  • Fixed two bugs in point-wise LU.
  • Solver changes:
    • Refactored solver classes (no interface changes); added TRichardson to replace TSolver in the future.
    • Fixed inconsistent computation of residual norm in solvers. Now Richardson, CG and BiCG will compute standard residual norm, while MINRES and GMRES compute preconditioned residual norm.
    • Made initialisation of start vector in solver classes optional (function initialise_start_value)
  • Added function diagonal to extract diagonal of a matrix.
  • Added example spectrum to compute spectrum of graph Laplacian (see also documentation).
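
The difference between the two residual norms mentioned above can be sketched as follows. This is a minimal illustration with small dense matrices, not HLIBpro code: the names Vec, Mat, apply, norm2, residual_norm and precond_residual_norm are hypothetical, and W stands for an assumed preconditioner applied from the left.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;   // dense, row-wise storage; for illustration only

// y = M * x
Vec apply(const Mat& M, const Vec& x)
{
    assert(M.empty() || M[0].size() == x.size());
    Vec y(M.size(), 0.0);
    for (std::size_t i = 0; i < M.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += M[i][j] * x[j];
    return y;
}

// Euclidean norm of v
double norm2(const Vec& v)
{
    double s = 0.0;
    for (double vi : v) s += vi * vi;
    return std::sqrt(s);
}

// r = b - A x
Vec residual(const Mat& A, const Vec& x, const Vec& b)
{
    Vec r = apply(A, x);
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = b[i] - r[i];
    return r;
}

// standard residual norm |b - A x|
double residual_norm(const Mat& A, const Vec& x, const Vec& b)
{
    return norm2(residual(A, x, b));
}

// preconditioned residual norm |W (b - A x)|
double precond_residual_norm(const Mat& W, const Mat& A, const Vec& x, const Vec& b)
{
    return norm2(apply(W, residual(A, x, b)));
}
```

With a good preconditioner the two values can differ by orders of magnitude, so stopping tolerances are not directly comparable between the two groups of solvers.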

v2.3 (released 2014-10-27)

  • Fixed issues when solving dense matrices (used in new example for many RHSs).
  • Modified THiLoFreqGeomAdmCond: now maximal number of wavelengths per cluster is tested.
  • Refactored geometrical clustering classes and partitioning strategies, thereby fixing several issues.
  • C++11 changes:
    • most object creating functions now return std::unique_ptr,
    • replaced typedef by using,
    • added iterators for TIndexSet, TNodeSet, TGraph, TProcSet (for range based for).
  • Added parameter to algebraic clustering in C bindings to define partitioning algorithm (BFS, multi level, METIS or Scotch).
  • Fixed issues with progress bar during factorisation (wrong block count).
  • Removed BSP style communication functions (MPI only now).
  • Finished conversion to new packed_t SIMD type. Using SSE3 instead of SSE2.
  • Added lock to TScotchAlgPartStrat because Scotch is not multi thread safe.
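
The C++11 changes listed above follow common language patterns, which can be sketched as below. The class IndexRange and the functions make_range and collect are hypothetical stand-ins written for this note; they are not the HLIBpro TIndexSet or any other library type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// "using" alias instead of typedef
using index_t = std::size_t;

// Hypothetical contiguous index range [first, last]
class IndexRange
{
public:
    // minimal iterator: just enough for a range-based for loop
    class iterator
    {
    public:
        explicit iterator(index_t pos) : pos_(pos) {}
        index_t   operator*() const { return pos_; }
        iterator& operator++() { ++pos_; return *this; }
        bool      operator!=(const iterator& other) const { return pos_ != other.pos_; }
    private:
        index_t pos_;
    };

    IndexRange(index_t first, index_t last) : first_(first), last_(last)
    {
        assert(first <= last);
    }

    iterator begin() const { return iterator(first_); }
    iterator end()   const { return iterator(last_ + 1); }

private:
    index_t first_, last_;
};

// Object-creating function handing ownership to the caller
std::unique_ptr<IndexRange> make_range(index_t first, index_t last)
{
    // plain new, since std::make_unique is not part of C++11
    return std::unique_ptr<IndexRange>(new IndexRange(first, last));
}

// Collect all indices using a range-based for loop
std::vector<index_t> collect(const IndexRange& range)
{
    std::vector<index_t> result;
    for (auto i : range)
        result.push_back(i);
    return result;
}
```

Returning std::unique_ptr makes the ownership transfer explicit, so the caller no longer needs a matching delete.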

v2.2 (released 2014-07-15)

  • Removing implicit reordering of unknowns during matrix-vector multiplication to fix inconsistent behaviour. Please use permutations from cluster trees or ℋ-matrices to reorder vectors or TPermMatrix to represent permuted matrices instead.
  • Speedup improvements for matrix inversion. Triangular inversion and matrix multiplication available in standard user interface.
  • Import/export from/to CCS/CRS matrices simplified.
  • Simplified (and faster) mutex wrapper.
  • Several C++11 changes.
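
Explicit reordering of a vector, as now required around matrix-vector multiplication, can be sketched as follows. The permutation layout (perm[i] is the internal position of external index i) and the names Perm, Vec, to_internal and to_external are assumptions made for this illustration; the actual permutation objects are provided by the cluster trees and matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Perm = std::vector<std::size_t>;
using Vec  = std::vector<double>;

// external -> internal ordering: internal[perm[i]] = external[i]
Vec to_internal(const Perm& perm, const Vec& external)
{
    assert(perm.size() == external.size());
    Vec internal(external.size());
    for (std::size_t i = 0; i < perm.size(); ++i)
        internal[perm[i]] = external[i];
    return internal;
}

// internal -> external ordering: external[i] = internal[perm[i]]
Vec to_external(const Perm& perm, const Vec& internal)
{
    assert(perm.size() == internal.size());
    Vec external(internal.size());
    for (std::size_t i = 0; i < perm.size(); ++i)
        external[i] = internal[perm[i]];
    return external;
}
```

A typical use would be to convert the input vector to internal ordering, multiply, and convert the result back with the permutation of the row cluster tree.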

v2.1 (released 2014-05-08)

  • Removing reference counters in BLAS interface due to major performance issue on multi-core (-socket) systems. See documentation on how to use the modified interface (and avoid errors).
  • New, scalable matrix-vector multiplication implemented.
  • Using generic datatype for SIMD instructions, thereby enabling generic SIMD algorithms, e.g. for BEM kernels, and fast adoption of new SIMD instructions, e.g. AVX2.
  • Removed TVirtualVector (replaced by TScalarVector).
  • and, as usual: several bugs fixed

v2.0

v2.0.2 (released 2014-01-24)

  • fixed race condition in C bindings
  • fixed issue with initialisation of static variables

v2.0.1 (released 2013-12-05)

  • fixed some bugs

v2.0 (released 2013-09-18)

  • Major Changes
    • Switched from OpenMP to Threading Building Blocks as interface to shared memory parallelism, thereby also changing most algorithms to task-based parallelism.
    • Reducing dependency on external libraries by using C++11 features. Also replacing some classes by default C++ versions (finally removing old code).
    • Alternative, non-recursive, level-wise ℋ-LU factorisation based on explicit block dependencies, which provides far better speedup on many-core systems, e.g. Intel MIC architecture.
    • New ℋ-LU factorisation algorithm also applicable in distributed environments, yielding better load-balancing (albeit with limited speedup).
    • Added support for multiple CPUs to many algorithms, e.g. in clustering, norm computations, matrix-vector multiplication and solves, ℋ²-conversion.
  • Minor Changes
    • Optimised BEM kernels for Intel MIC architecture.
    • Introduced TLinearOperator for operators not supporting TMatrix functionality, e.g. factorised matrices.
    • HLIBpro file format changed due to internal changes and due to some bugs in the format. However, backward read compatibility for most files written with earlier versions is kept.
    • Added support for Cairo library, thereby providing PDF output.
  • And of course: many smaller feature upgrades and bug fixes.

v1.2 (released 2012-02-23)

  • Matrix Construction:
    • Switched to template based coefficient functions (TCoeffFn and derived) and all dependent classes, e.g. TDenseMBuilder, SVD and ACA low rank approximation.
    • Rewrote HCA:
      • Simpler interface containing all necessary functionality in single class.
      • Using template for value type.
      • Added base classes for permuted indices and for BEM applications using quadrature.
      • Added implementation for Laplace and Helmholtz also for linear ansatz spaces and with support for SSE2 and AVX.
    • Cleaned up ACA implementation.
    • Changed handling of recompression: should now be handled by default for low rank approximation algorithm and not by matrix construction class (to avoid recompression of optimal results).
  • Clustering Changes:
    • Added TNDBSPPartStrat to be used in connection with nested dissection (trying various clusterings and choosing best for ND).
    • Modified TNDBSPCTBuilder to more closely resemble the algebraic version, e.g. average depth for interface clusters instead of maximal.
    • Fixed bug in PCA based clustering and added version for cardinality based clustering.
    • Added various flags to modify clustering, e.g. synchronisation of interface depth, enforcing block clusters with same depth of corresponding clusters, using symmetrised weights in algebraic clustering.
  • Input/Output and visualisation:
    • Fixed bug in reading dense matrices.
    • Changed order of dimension for coordinate IO using Matlab format: now ncoord × dimension (e.g. as also used by Sparse Matrix Collection).
    • Added VTK visualisation for coordinates (with various options, e.g. marking clusters or index connectivity) and BEM grids.
    • Added output of grids in HLIB format.
    • Added coordinate IO in MatrixMarket format.
  • Changes in LAPACK wrapper:
    • added LAPACK workspace queries for optimal workspace size instead of using predefined block size
    • using xGESDD for large matrices
  • various bug fixes.

v1.1

v1.1.1 (released 2011-11-29)

  • Deactivated default coarsening during matrix construction.
  • Added special H² matrix builder with predefined cluster bases.

v1.1 (released 2011-11-28)

  • changes in BEM code:
    • Added support for AVX.
    • Performance speedups in SSE2 implementation of Helmholtz and Maxwell kernels.
    • Runtime detection of SSE2/AVX availability and automatic choice of optimal kernel.
  • Added matrix_format function to matrix coefficient functions to define whether unsymmetric, symmetric or hermitian (default: unsymmetric).
    Default build function in matrix builders now without format argument.
  • Added support for ILP64 BLAS/LAPACK implementations (64bit integers).
  • Added support for AMD-LibM (integrated in binary Linux distributions).
  • Added vector IO in MatrixMarket format.
  • Cleaned up C++ examples (thereby also removing Boost link dependency).
  • several bug fixes

v1.0

v1.0.1 (released 2011-09-28)

  • OpenMP exception handling changed: now all threads will stop as soon as possible in case of an error
  • fixed several, previously undetected, non-critical compiler warnings (MS Visual C++)
  • bug fixes

v1.0 (released 2011-07-01)

HLIBpro v1.0 is a major rewrite/reorganisation of many of the 𝓗-matrix algorithms.
The following list of changes only covers the main topics and is far from complete.

  • added distributed computing via MPI for matrix construction and factorisation
  • added 𝓗²-matrices
  • added internal multi-level graph partitioning for blackbox clustering
  • added support for piecewise linear basis functions and Maxwell EFIE/MFIE
  • rewrote interface to BLAS/LAPACK
  • rewrote C interface with better mapping of internal C++ and C types
  • increased robustness of matrix factorisation in case of ill-conditioned matrices
  • increased speedup of matrix factorisation in multi-threaded computations
  • many performance improvements and bug fixes

v0.13

v0.13.6 (released 2009-01-30)

  • added optional diagonal scaling of 𝓗-matrices during LU factorisation
  • added blockwise accuracy, i.e. accuracy depending on current matrix block
  • rewrote accuracy handling in C bindings
  • simplified BSP partitioning methods and added regular cardinality based and principal component based clustering
  • added optional balancing of tree depth in cluster tree construction with predefined partitioning
  • implemented optional double precision computation of matrix inversion and low-rank truncation in single precision mode
  • fixed bug in calling single precision norm functions of LAPACK
  • fixed bug in PostScript output and modified 𝓗-matrix output in PostScript format
  • added support for Jacobi based SVD (sgejsv and dgejsv) in LAPACK v3.2

v0.13.5 (released 2008-09-23)

  • removed ID based cluster tree computations in matrices
  • always computing SCC in algebraic clustering, also in nested dissection clustering
  • reordering clusters depending on size ratio (large first)
  • fixed bug with filenames without directories
  • fixed non-exception safe OpenMP usage
  • added matrix reduction to nearfield part
  • added dense low-rank multiplication if result is large dense matrix

v0.13.4 (released 2008-04-24)

  • fixed solve functions in TLU, TLDL (checking for NULL blocks)
  • fixed OpenMP call with zero threads in TLU, TLDL and TMatrixInv
  • fixed operator = in autoptr (wrong const)
  • removed unnecessary checks in TArray::copy
  • fixed recursive call in restrict_blockdiag
  • replaced fixed constants by type dependent constants in lapack.cc
  • fixed TMatrixInv::multiply_diag when only D is dense

v0.13.3 (released 2008-03-27)

  • fixed several warnings from Visual C++ and Intel C++ compilers
  • moved all global variables and functions into HLIB namespace (except xerbla override)
  • enabled user defined prefix for functions and types in C interface and added override for namespace name
  • reactivated cardinality check when using HLIB_BSP_AUTO

v0.13.2 (released 2008-02-29)

  • replaced threads and mutexes by OpenMP (thread start only, no scheduling)
  • included log file support in addition to stdout
  • added parallel LDLT factorisation (DD and blockdiag only)
  • added parallel blockdiag LU factorisation
  • added zero approximation during matrix construction (for nearfield only)
  • fixed bug in algebraic nested dissection clustering (wrong path length in interface)

v0.13.1 (released 2008-02-04)

  • reduced memory consumption/fragmentation in ACA generated matrices with large rank
  • added Fiduccia/Mattheyses bisection optimisation for BFS clustering
  • added FFT for vectors by implementing support for FFTW3 (optional)
  • fixed bug in TBSPPartCTBuilder when using more than two partitions
  • fixed potential issues in sorting algorithms
  • fixed type issues with *_bytesize functions in C interface
  • fixed bug in PostScript visualisation of matrices if matrix norm is zero
  • fixed issues with GCC-4.3
  • fixed bug in command line parsing of configuration system
  • minor modifications to SCons system to increase user-friendliness

v0.13(.0) (released 2007-12-19)

  • General Algorithmic Changes
    • support for single precision arithmetic; has to be decided before compiling HLIBpro
    • made complete C++ functions and classes visible from outside instead of just C interface functions
    • rewrote complex arithmetic to distinguish between symmetric and hermitian matrices; added LDLH and LLH factorisations
    • inversion now based on LU, thereby reducing memory consumption (roughly halved)
    • added computation of the diagonal of the inverse without computing the inverse
    • added evaluation of LU, LDLT factorisations (instead of just solving)
    • removed point-wise LU and LDLT factorisation (only blocked) to improve robustness with zeroes on diagonal
    • added (optional) check and fix for singular sub matrices during inversion and factorisation
    • added complex valued HCA
    • new version of ACA+
    • multiplication C = ADB with diagonal D implemented
    • implemented bilinear forms for Helmholtz single and double layer potential
    • implemented bilinear form for acoustic scattering
    • rewrote algebraic clustering for sparse matrices; added support for Scotch and CHACO
    • added support for periodic coordinates in clustering
    • added clustering with user defined index partition on first level in cluster tree
    • added standard admissibility for algebraic clustering
    • added maximal level in clustering to prevent infinite recursion
    • modified solvers to handle complex valued data
    • added permutation of dense matrices without temporary storage (needed in IO)
  • Parallel Arithmetic
    • added thread parallel algorithms for matrix construction, matrix multiplication, inversion and LU factorisation
    • redesigned thread pool, thereby fixing race conditions
    • added support for Windows threads
    • fixed several issues with thread safety
  • Input and Output
    • added general I/O functions with autodetection of file format
    • added output of matrices in Harwell/Boeing format
    • added MatrixMarket format
    • added support for Ply and surface mesh format (NetGen) for Grid I/O
    • fixed format errors in SAMG output
    • conversion of arbitrary matrices to sparse format when writing in SAMG or Harwell/Boeing format
    • fixed support for symmetric matrices in Harwell/Boeing format
  • C interface
    • prefixed all functions, types and constants with hlib_ (or HLIB_) to prevent collisions with other definitions (OS or libraries)
    • added support for C99 complex types (if available)
    • added hlib_set_coarsening to activate/deactivate coarsening during matrix construction (default: on) and matrix arithmetic (default: off)
    • added hlib_matrix_inv_diag to return diagonal of inverse
    • added hlib_matrix_is_complex to test for real or complex valued matrices
    • added hlib_set_nthreads to set number of threads
    • added hlib_coord_t as special type for coordinates
    • separated stop criterion and solver in solver interface
  • Miscellaneous
    • updated CPUflags and Rmalloc
    • fixed optimisation issues (leading to infinite loops) in enclosed CLAPACK

v0.12 (released 2006-11-01)

  • Algorithmic Changes
    • added (blocked) LDLT factorisation (now default for symmetric matrices)
    • no longer need extra matrix in matrix inversion
    • using ACAFull in HCA (instead of SVD)
    • adaptively choosing quadrature and interpolation order in ACA and HCA
    • rewrote matrix addition to support general cases, e.g. low-rank to blocked
    • rewrote low-rank truncation handling
    • support for METIS in algebraic clustering routines
    • added basic support for "dense" sparse matrices, e.g. with highly coupled indices
    • added SSE2 based HCA algorithm
    • added infinity norm for vectors
    • using norm of preconditioned residual for all solvers if preconditioner is present
    • added MINRES iteration
    • using ADM_AUTO as default admissibility
    • finally removed all asserts and replaced by internal error checking
  • Input and Output
    • VRML97 support
    • added Matlab compression (Matlab v7) and structs support
    • support for Harwell-Boeing matrix format (read-only)
    • modified PostScript output of block-wise SVD; now scaled w.r.t. 2-norm of matrix
  • OS and Library support
    • MS Windows support
    • shared libraries for Linux and Windows
    • changed configure system to better handle MS Windows environment
    • added internal xerbla to handle LAPACK errors directly
  • C interface
    • automatic choice of matrix building in hlib_matrix_build_bem_grid
    • introduced vector_t as type to vectors (no more C arrays)
    • added Gauss and Sauter triangle quadrature rules
    • added functions to access matrix and vector entries
    • added copyto and copyto_eps functions
    • added hlib_matrix_build_dense to build 𝓗-matrix from dense matrix
    • changed solver management
  • Miscellaneous
    • several improvements and bug fixes
    • cleaned up error codes
    • updated CPUflags and Rmalloc

v0.11 (released 2006-05-29)

  • Arithmetic
    • added ACA-Full
    • added HCA (hybrid cross approximation)
    • complex valued ACA and SVD
    • added copy with coarsening for 𝓗-matrices
    • added computation of spectral norm for the inverse of a matrix
    • support for permutations in matrix-vector multiplication of sparse matrices
    • added support for Laplace SLP/DLP and 3D triangle surface grids
    • fixed issues with degenerated bounding boxes in geometrical clustering
  • Input/Output
    • support for PLTMG matrix format
  • Miscellaneous
    • replaced error handling with exceptions
    • added modified CLAPACK as default implementation of LAPACK to HLIBpro
    • integrated CPUFlags into configure system
    • added function for fast reciprocal square root

v0.10 (released 2006-04-05)

  • Arithmetic
    • initial support for complex arithmetic
    • support for symmetric matrices in arithmetic
    • implemented block LU factorisation
    • implemented LDLT factorisation
    • added Frobenius norm for sparse matrices
    • support for CRS format in sparse matrices
    • added Jacobi and SOR matrix types (for matrix-vector multiplication)
    • implemented hierarchical domain decomposition with parallel arithmetics
  • Parallel Algorithms
    • thread-parallel Cholesky factorisation
    • thread-parallel coarsening of 𝓗-matrices
    • fixed thread-parallel LU and inversion
    • fixed dead-locks in thread-pool
    • added direct communication in BSP mode
    • parallel addition of matrices and vectors via streams
  • Input/Output
    • support for Matlab and SAMG format
  • Miscellaneous
    • introduced C interface functions and types
    • added configure system for Makefiles
    • added progress meter support for arithmetic
    • added internal RTTI system
    • support for memory consumption query on HP-UX
    • rewrote error handling

v0.9 (released 2004-11-30)

  • first public version as PHI (Parallel H-matrix Implementation)
  • merged BSP-parallel and thread-parallel versions of 𝓗-matrix library