Skip to content

Fortran Inverse Matrix - Strumpack Dense Decaying Accuracy after n x n 7000, and Strumpack Sparse Speed to Strumpack Dense Solver #129

@realbabilu

Description

@realbabilu

Hi
I just trying strumpack for inverse matrix in Windows. it worked. just a hiccups in libgk building. but i see some problems later.
i saw the examples wrapper src\fortran\strumpack.f90 and src\structured\fortran\strumpack_dense.f90.
i built strumpack library wtih gcc mingw32 native on windows (equation.com) with libmets and libgk, openmp version, nompi, and openblas (later demoed on win10 with intel oneapi).
i created with nvdemo_strumpacksparse.f90 (here attached as txt) for dense solver and invdemo_strumpack.f90 (here attached as txt) for sparse solver.

invdemo_strumpacksparse.f90.txt
invdemo_strumpackdense.f90.txt

i have compiled with

gcc strumpack_dense.f90 -c -Ic:\gcc\libstrumpack\include
gcc strumpack.f90 -c -Ic:\gcc\libstrumpack\include
gfortran invdemo_strumpacksparse.f90 strumpack_dense.o strumpack.o libstrumpack.a -lstdc++ -fopenmp libmetis.a libgk.a libopenblas.a -fopenmp -o strumpack_sparse.exe
gfortran invdemo_strumpacksdense.f90 strumpack_dense.o strumpack.o libstrumpack.a -lstdc++ -fopenmp libmetis.a libgk.a libopenblas.a -fopenmp -o strumpack_dense.exe

Since dense quite fast, i create dual inversion check, invers and reinverse the result, and compare to the original matrix.
while the sparse, i only compare with inverse lapack result for single inverse.

My observation so far.

  1. Dense speed over sparse with 1000x1000 sparsity 90%, dense done it in 0.6s, while sparse done it 6s. I can saw the sparse only get 2 threads in windows taskmanager , eventhough said 16 openmp threads is used. Why the sparse solver can not faster than dense. is it my lib compile set wrong?
  2. When using sparse solver, if it dense, it is actually run smoothly. But when the sparsity comes in, i set around 0.3 for example, without set the matching none, it will get halt in ordering or segmentation fault, eventhough i changed into AMD ordering instead METIS ordering. STRUMPACK_MATCHING_MAX_SMALLEST_DIAGONAL is a must or STRUMPACK_MATCHING_MAX_DIAGONAL_PRODUCT_SCALING for sparse matrix. STRUMPACK_BLR will get some speed than STRUMPACK_HSS. but still a wide margin with dense solver for solving sparse matrix using openmp.
  3. Dense solver is fast. but around 1000, i need to get tolerance higher, i set to 1.d-12, but in n= 7000 it create mode accuracy error. even if i set tolerance 1.0d-15 it cant help the accuracy.
ierr = SP_d_struct_from_dense(S, n, n, T(1,1), n, options)
options%verbose = 0
options%rel_tol = 1.0d-12
options%abs_tol = 1.0d-12

here the result of dense for n = 7000.
it is normal for n = 4000.

C:\temp\strumpack>invdemo_strumpackdense.exe
 Enter the size of the square matrix (n):
7000
 Enter the sparsity level (0.0 to 1.0, where 1.0 is completely sparse):
0.9
 === Matrix Statistics ===
  Matrix size (n x n):                7000
  Requested sparsity:               0.9000
  Achieved sparsity:                0.8999
  Number of non-zero elements:     4905992
  Number of zero elements:        44094008
  Percentage of non-zeros:           10.01
  Percentage of zeros:               89.99
 ================================
 Original Dense Matrix T top 4x4:
   33.668965    0.000000    0.000000   74.283103
    0.000000    0.989305    0.000000    0.000000
    0.000000    0.000000    0.097195    0.000000
    0.000000    0.000000    0.000000   31.626349
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 First inverse Matrix Tinv top 4x4:
    0.001657    0.000213   -0.000318   -0.002225
   -0.002024    0.000494   -0.000465    0.003381
   -0.000697   -0.000297    0.000604    0.001133
   -0.000182    0.000202    0.000451    0.001643
 STRUMPACK first inverse in    74.449000120162964       s
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 Second inverse Matrix Tinv top 4x4:
   33.668965    0.000000    0.000000   74.283103
    0.000000    0.989305    0.000000    0.000000
    0.000000    0.000000    0.097195   -0.000000
   -0.000000    0.000000   -0.000000   31.626349
 STRUMPACK second inverse in    72.141999959945679       s
 Max error strumpack T - inv(inv(T)) =    28.205894119440590
 first inverse
 First inverse Matrix Lapack top 4x4:
    0.001650    0.000213   -0.000320   -0.002225
   -0.002041    0.000472   -0.000464    0.003418
   -0.000703   -0.000305    0.000605    0.001147
   -0.000157    0.000217    0.000455    0.001620
 second inverse
 Second inverse Matrix Lapack top 4x4:
   33.668965   -0.000000    0.000000   74.283103
    0.000000    0.989305    0.000000   -0.000000
    0.000000    0.000000    0.097195   -0.000000
    0.000000    0.000000    0.000000   31.626349
 Max error lapack T - inv(inv(T)) =    2.3274981941767692E-009`
C:\temp\strumpack>invdemo_strumpackdense.exe
 Enter the size of the square matrix (n):
4000
 Enter the sparsity level (0.0 to 1.0, where 1.0 is completely sparse):
0.9
 === Matrix Statistics ===
  Matrix size (n x n):                4000
  Requested sparsity:               0.9000
  Achieved sparsity:                0.8998
  Number of non-zero elements:     1602502
  Number of zero elements:        14397498
  Percentage of non-zeros:           10.02
  Percentage of zeros:               89.98
 ================================
 Original Dense Matrix T top 4x4:
    0.690246   28.653805    0.000000    0.000000
    0.000000   35.140132    0.000000    0.000000
    0.000000    0.000000    0.820807    0.000000
   26.619199    0.000000   53.660974    0.578378
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 First inverse Matrix Tinv top 4x4:
   -0.011726    0.002219   -0.010302    0.005187
   -0.005151    0.001073   -0.006667    0.002209
    0.003418   -0.000376    0.003219   -0.001568
   -0.002323    0.000339   -0.000954    0.001469
 STRUMPACK first inverse in    15.020999908447266       s
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 Second inverse Matrix Tinv top 4x4:
    0.690246   28.653805    0.000000   -0.000000
    0.000000   35.140132   -0.000000   -0.000000
    0.000000    0.000000    0.820807   -0.000000
   26.619199   -0.000000   53.660974    0.578378
 STRUMPACK second inverse in    14.835000038146973       s
 Max error strumpack T - inv(inv(T)) =    1.5499991406142130E-009
 first inverse
 First inverse Matrix Lapack top 4x4:
   -0.011726    0.002219   -0.010302    0.005187
   -0.005151    0.001073   -0.006667    0.002209
    0.003418   -0.000376    0.003219   -0.001568
   -0.002323    0.000339   -0.000954    0.001469
 second inverse
 Second inverse Matrix Lapack top 4x4:
    0.690246   28.653805    0.000000   -0.000000
    0.000000   35.140132    0.000000   -0.000000
   -0.000000   -0.000000    0.820807   -0.000000
   26.619199   -0.000000   53.660974    0.578378
 Max error lapack T - inv(inv(T)) =    2.0820268543062612E-009

Created another version to detect anomaly. now it get 10000 for wrong answer.

C:\temp\strumpack>1
 Enter the size of the square matrix (n):
8000
 Enter the sparsity level (0.0 to 1.0, where 1.0 is completely sparse):
0.9
 === Matrix Statistics ===
  Matrix size (n x n):                8000
  Requested sparsity:               0.9000
  Achieved sparsity:                0.9000
  Number of non-zero elements:     6403043
  Number of zero elements:        57596957
  Percentage of non-zeros:           10.00
  Percentage of zeros:               90.00
 ================================
 Original Dense Matrix T top 4x4:
    0.355491    0.000000    0.000000    0.000000
    0.000000    0.245901    0.000000    0.000000
    0.000000    0.000000    0.342192    0.000000
   58.534653    0.000000    0.000000    2.929903
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 First inverse Matrix Tinv top 4x4:
   -0.000388   -0.000324   -0.000103    0.000329
    0.001289   -0.001696    0.001533    0.000335
   -0.000115   -0.000217    0.000093    0.000104
   -0.000728    0.001794   -0.001793    0.000686
 STRUMPACK first inverse in    91.796999931335449       s
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 Second inverse Matrix Tinv top 4x4:
    0.355491   -0.000000    0.000000    0.000000
   -0.000000    0.245901    0.000000   -0.000000
   -0.000000    0.000000    0.342192    0.000000
   58.534653   -0.000000   -0.000000    2.929903
 STRUMPACK second inverse in    105.96200013160706       s
 Max error strumpack T - inv(inv(T)) =    2.9291818842130105E-009
 Where is it?
   Location of max error (STRUMPACK): (        6639 ,        1670 ) with error =    2.9291818842130105E-009
  T at location   0.0000000000000000
  Strumpack Ainvinv at location   2.9291818842130105E-009
 first inverse
 First inverse Matrix Lapack top 4x4:
   -0.000388   -0.000324   -0.000103    0.000329
    0.001289   -0.001696    0.001533    0.000335
   -0.000115   -0.000217    0.000093    0.000104
   -0.000728    0.001794   -0.001793    0.000686
 second inverse
 Second inverse Matrix Lapack top 4x4:
    0.355491    0.000000    0.000000    0.000000
    0.000000    0.245901    0.000000   -0.000000
    0.000000   -0.000000    0.342192   -0.000000
   58.534653    0.000000    0.000000    2.929903
 Max error lapack T - inv(inv(T)) =    3.1295925850827189E-009
   Location of max error (LAPACK): (        6415 ,        3777 ) with error =    3.1295925850827189E-009

C:\temp\strumpack>1
 Enter the size of the square matrix (n):
10000
 Enter the sparsity level (0.0 to 1.0, where 1.0 is completely sparse):
0.9
 === Matrix Statistics ===
  Matrix size (n x n):               10000
  Requested sparsity:               0.9000
  Achieved sparsity:                0.8999
  Number of non-zero elements:    10009575
  Number of zero elements:        89990425
  Percentage of non-zeros:           10.01
  Percentage of zeros:               89.99
 ================================
 Original Dense Matrix T top 4x4:
    0.730090    0.000000    0.000000   30.969471
    0.000000    0.117732    0.000000    0.000000
    0.000000    0.000000    0.769833    0.000000
   61.726215    0.000000    0.000000    0.528951
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 First inverse Matrix Tinv top 4x4:
    0.001450   -0.000544    0.000523    0.001219
    0.000629   -0.000173    0.000012    0.000385
   -0.001477    0.000598   -0.000111   -0.001314
    0.000273   -0.000067    0.000545   -0.000053
 STRUMPACK first inverse in    213.92899990081787       s
# WARNING: HSS construction uses O(N^2) algorithm based on random projection.
# Consider passing geometry info in argument p, or use construct_partially_matrix_free.
 Second inverse Matrix Tinv top 4x4:
    0.730090   -0.000000   -0.000000   30.969471
    0.000000    0.117732   -0.000000   -0.000000
    0.000000   -0.000000    0.769833    0.000000
   61.726215   -0.000000    0.000000    0.528951
 STRUMPACK second inverse in    219.53099989891052       s
 Max error strumpack T - inv(inv(T)) =    92.755430539481750
 Where is it?
   Location of max error (STRUMPACK): (         125 ,        6249 ) with error =    92.755430539481750
  T at location   0.0000000000000000
  Strumpack Ainvinv at location  -92.755430539481750

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions