
GSoC Task #2: Implement aggregation method for vector-valued problems #153

Closed · 3 of 4 tasks
GeliezaK opened this issue Jun 13, 2024 · 30 comments
Assignee: GeliezaK
Labels: gsoc (Google Summer of Code)

@GeliezaK (Collaborator) commented Jun 13, 2024

Meeting notes 2024-06-13:
@oriolcg
Attendees: @fverdugo @amartinhuertas @GeliezaK
Notes:

Action Points:

  • perform spell check on tutorial
  • blog post on LinkedIn with link to tutorial PR
  • add consistency tests to test_amg.jl
  • start implementing matrix-to-graph transformation for CSC sparse matrix and PSparse matrix
GeliezaK added the gsoc (Google Summer of Code) label on Jun 13, 2024
GeliezaK self-assigned this on Jun 13, 2024
@fverdugo (Owner) commented Jun 13, 2024

This file shows several different ways of computing the "strength" graph of a sparse matrix, including for vector-valued problems:

https://github.com/pyamg/pyamg/blob/e1fe54c93be1029c02ddcf84c2338a607b088703/pyamg/strength.py#L275

@GeliezaK (Collaborator, Author) commented Jun 18, 2024

@fverdugo @oriolcg @amartinhuertas
When I add a new function matrix_graph to the amg.jl file, and try to call it from amg_tests.jl, I get "UndefVarError: matrix_graph not defined". Is there anything else I need to do to make the function available from the test file?

@amartinhuertas (Collaborator):

You can either export it from the package or use package_name.matrix_graph directly.
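For example, a minimal sketch of both options (assuming matrix_graph lives in the PartitionedSolvers module and takes a sparse matrix A; adjust the names to the actual package layout):

# Option 1: export it next to its definition, e.g. in the module file:
#     export matrix_graph
# and then call it directly from amg_tests.jl:
using PartitionedSolvers
G = matrix_graph(A)

# Option 2: no export; qualify the call with the module name:
G = PartitionedSolvers.matrix_graph(A)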

@GeliezaK (Collaborator, Author) commented Jun 20, 2024

Meeting notes from 2024-06-20:
@oriolcg @amartinhuertas

Attendees: @fverdugo @GeliezaK

Action Points:

  • read notebook SMVP and SMMP
  • implement sequential matrix_graph without modifying input data

@GeliezaK (Collaborator, Author) commented Jun 27, 2024

Meeting notes from 2024-06-27:

Attendees: @fverdugo @GeliezaK @oriolcg @amartinhuertas

Action Points:

  • understand implementation of constant prolongator for scalar case
  • implement vector-valued version of constant prolongator constant_prolongator_with_block_size
  • tests for constant prolongator
  • changes for sequential strength graph: avoid push!, change theta to epsilon

@fverdugo (Owner) commented Jun 28, 2024

Hi @GeliezaK

I realize that the function constant_prolongator_with_block_size is not the easiest way of achieving our goal. An alternative is to transform node_to_aggregate into dof_to_aggregate (with the function below) and pass the result to the existing constant_prolongator.

function add_block_size_to_aggregates(node_to_aggregate,aggregates,block_size)
    naggregates = length(aggregates)
    nnodes = length(node_to_aggregate)
    ndofs = nnodes*block_size
    dof_to_aggregate = zeros(Int32,ndofs)
    # Map each node-level aggregate to block_size dof-level aggregates.
    for node in 1:nnodes
        aggregate = node_to_aggregate[node]
        for i in 1:block_size
            dof = (node-1)*block_size + i
            dof_to_aggregate[dof] = (aggregate-1)*block_size + i
        end
    end
    dof_to_aggregate, 1:(naggregates*block_size)
end
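A small illustrative call (the values and the names on the left-hand side are made up):

node_to_aggregate = Int32[1, 1, 2, 2, 2]   # 5 nodes grouped into 2 aggregates
aggregates = 1:2
dof_to_aggregate, dof_aggregates = add_block_size_to_aggregates(node_to_aggregate, aggregates, 3)
# dof_to_aggregate == [1,2,3, 1,2,3, 4,5,6, 4,5,6, 4,5,6]
# dof_aggregates   == 1:6
# dof_to_aggregate can then be passed to the existing constant_prolongator.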

@fverdugo (Owner):

Hi @GeliezaK ,

yet another edit. I don't think we need a "constant prolongator" with block size > 1. I realized this after looking at Algorithm 7, page 46, here:

https://mediatum.ub.tum.de/download/1229321/1229321.pdf

We need to build the "node based" constant prolongator just like now, and implement a tentative prolongator that takes into account the block size and the near null space (Algorithm 7). Something like this:

    function coarsen(A,B)
        G = strength_graph(A,block_size)
        diagG = dense_diag(G)
        node_to_aggregate, aggregates = aggregate(G,diagG;epsilon)
        PC = constant_prolongator(node_to_aggregate, aggregates,1)
        P0,Bc = tentative_prolongator_with_block_size(PC,B,block_size) # Algorithm 7
        ...
    end

@fverdugo (Owner):

NB. When qr is mentioned in Algorithm 7, it is really a "thin" QR decomposition. See the section on rectangular matrices here: https://en.wikipedia.org/wiki/QR_decomposition

This discussion is relevant to how to compute the thin QR in Julia:

JuliaLang/LinearAlgebra.jl#529
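For example, a minimal sketch with the standard library (the matrix Bi is illustrative; it assumes Matrix(F.Q) returns the thin factor, which is precisely what the linked discussion is about):

using LinearAlgebra

Bi = rand(6, 3)     # tall rectangular block, as in Algorithm 7 (rows >= columns)
F = qr(Bi)
Q1 = Matrix(F.Q)    # 6x3 thin Q factor
R1 = F.R            # 3x3 upper triangular factor
@assert Q1 * R1 ≈ Bi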

@GeliezaK (Collaborator, Author) commented Jul 1, 2024

Hi @fverdugo ,
Thank you for your comments! From Algorithm 7 I understand that the information in node_to_aggregate would be sufficient to compute the tentative prolongator. Why is the constant prolongator needed?
Besides, do you have an idea how to circumvent the QR decomposition if the number of rows of the near null space B is smaller than its number of columns (see Remarks 3.4.2 and 6.3.2 in https://mediatum.ub.tum.de/download/1229321/1229321.pdf) ?

@fverdugo (Owner) commented Jul 3, 2024

Thank you for your comments! From Algorithm 7 I understand that the information in node_to_aggregate would be sufficient to compute the tentative prolongator.

You are correct, but note that you cannot use node_to_aggregate directly. This vector gives you, for each node, the corresponding aggregate. You need the inverse map: for an aggregate, you want the list of nodes in that aggregate. This info is encoded in the CSC storage of the constant_prolongator. In any case, what we actually want is this other function that returns a JaggedArray (a vector of vectors) instead of the constant_prolongator. Note that the code is almost identical to that of the constant_prolongator:

function collect_nodes_in_aggregate(node_to_aggregate,aggregates)
    typeof_aggregate = eltype(node_to_aggregate)
    pending = typeof_aggregate(0)
    isolated = typeof_aggregate(-1)
    nnodes = length(node_to_aggregate)
    naggregates = length(aggregates)
    # Count the nodes in each aggregate (CSC-style pointer construction).
    aggregate_to_nodes_ptrs = zeros(Int,naggregates+1)
    for node in 1:nnodes
        agg = node_to_aggregate[node]
        if agg == pending
            continue
        end
        aggregate_to_nodes_ptrs[agg+1] += 1
    end
    length_to_ptrs!(aggregate_to_nodes_ptrs)
    ndata = aggregate_to_nodes_ptrs[end]-1
    aggregate_to_nodes_data = zeros(Int,ndata)
    # Fill in the node ids of each aggregate.
    for node in 1:nnodes
        agg = node_to_aggregate[node]
        if agg == pending
            continue
        end
        p = aggregate_to_nodes_ptrs[agg]
        aggregate_to_nodes_data[p] = node
        aggregate_to_nodes_ptrs[agg] += 1
    end
    rewind_ptrs!(aggregate_to_nodes_ptrs)
    aggregate_to_nodes = jagged_array(aggregate_to_nodes_data,aggregate_to_nodes_ptrs)
    aggregate_to_nodes
end

nodes_in_agg = aggregate_to_nodes[agg] is the list of node ids (stored in a vector) corresponding to the aggregate with id agg.
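For example, with a small illustrative input (jagged_array, length_to_ptrs! and rewind_ptrs! are the PartitionedArrays helpers used above):

node_to_aggregate = Int32[1, 1, 2, 2, 2]
aggregates = 1:2
aggregate_to_nodes = collect_nodes_in_aggregate(node_to_aggregate, aggregates)
aggregate_to_nodes[1]   # [1, 2]
aggregate_to_nodes[2]   # [3, 4, 5]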

@fverdugo (Owner) commented Jul 3, 2024

Besides, do you have an idea how to circumvent the QR decomposition if the number of rows of the near null space B is smaller than its number of columns (see Remarks 3.4.2 and 6.3.2 in https://mediatum.ub.tum.de/download/1229321/1229321.pdf) ?

I don't have an answer for this at this moment. We can always merge singleton aggregates with a neighboring one until there are no singleton aggregates left. This will work if the mesh has at least 2 nodes.

@GeliezaK (Collaborator, Author) commented Jul 4, 2024

Meeting notes from 2024-07-04:
@fverdugo
Attendees: @GeliezaK @oriolcg @amartinhuertas

Notes:

  • null space matrices B should be stored in dense format because null vectors have many nonzero entries
  • tentative prolongator P is now computed as a block-diagonal matrix; however, the rows need to be permuted again before computing the matrix A for the coarse level

Action Points:

  • represent B and Bc as dense matrices in the tentative prolongator
  • permute rows in Pc back to the original row ordering in B
  • check if the permutation matrix is orthogonal
  • test correctness of the tentative prolongator using Pc * Bc = B (a minimal test sketch follows below)
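A minimal version of that last check could look like this (using the names from this list; elsewhere in the thread the prolongator is called P0):

using Test
# The defining property of the tentative prolongator: it reproduces the
# fine near null space B from the coarse one Bc.
@test Pc * Bc ≈ B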

@GeliezaK (Collaborator, Author) commented Jul 11, 2024

Meeting notes from 2024-07-11:

Attendees: @GeliezaK @oriolcg @amartinhuertas @fverdugo

Action Points:

  • implement the tentative prolongator in a more efficient way (see the sketch after this list):
  1. allocate the output Pc as a sparse matrix with the correct sparsity pattern,
  2. fill Pc with the values of B,
  3. loop over aggregates to build Bi and compute its QR decomposition,
  4. move values from Q to P0,
  5. move values from R to Bc; represent B and Bc as vectors of vectors (compare default_nullspace)
  • ignore singleton node aggregates in the tentative prolongator. This affects testing because the property P0*Bc = B is no longer valid.
  • test how many iterations of the power method are needed to converge to the eigenvectors of the 1D Laplace matrix. Use a zero vector as the initial eigenvector guess.
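A rough sequential sketch of steps 1-5 above (dense B and a plain vector-of-vectors aggregate_to_nodes for clarity; the function name, the layout, and the assumption that every aggregate provides at least n_B dofs are illustrative, not the final implementation):

using LinearAlgebra, SparseArrays

function tentative_prolongator_sketch(aggregate_to_nodes, B, block_size)
    n_B = size(B, 2)                       # number of near null space vectors
    n_aggs = length(aggregate_to_nodes)
    n_dofs = size(B, 1)
    P0 = spzeros(n_dofs, n_aggs * n_B)
    Bc = zeros(n_aggs * n_B, n_B)
    for agg in 1:n_aggs
        nodes = aggregate_to_nodes[agg]
        # block_size consecutive dofs per node
        dofs = vcat([(node - 1) * block_size .+ (1:block_size) for node in nodes]...)
        Bi = B[dofs, :]                    # local block of the near null space
        F = qr(Bi)                         # thin QR: Bi == Q1 * R1
        cols = (agg - 1) * n_B .+ (1:n_B)
        P0[dofs, cols] = Matrix(F.Q)       # Q fills the prolongator block
        Bc[cols, :] = F.R                  # R becomes the coarse near null space block
    end
    P0, Bc                                 # P0 * Bc ≈ B by construction
end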

@GeliezaK (Collaborator, Author) commented Jul 18, 2024

Meeting notes from 2024-07-18:
@oriolcg
Attendees: @GeliezaK @amartinhuertas @fverdugo

Action Points:

  • use the new functions laplacian_fdm and laplacian_fem to test the tentative prolongator
  • raise an error in the tentative prolongator if n_i < n_B
  • test powm! and spectral radius with parallel matrices (a sequential sketch follows below)
  • test the performance of powm! and spectral radius for different matrix sizes, and compare the number of iterations needed for convergence
  • think about parallelization
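A sequential sketch of the powm! test (assuming powm! from IterativeSolvers.jl; the matrix size and tolerances are arbitrary):

using IterativeSolvers, SparseArrays, LinearAlgebra

n = 100
# 1D Laplace matrix (tridiagonal)
A = spdiagm(-1 => fill(-1.0, n-1), 0 => fill(2.0, n), 1 => fill(-1.0, n-1))
x0 = rand(n)   # the power method needs a nonzero start vector
λ, x = powm!(A, x0; maxiter = 1000, tol = 1e-6)
ρ = abs(λ)     # estimate of the spectral radius (largest eigenvalue in magnitude)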

@GeliezaK (Collaborator, Author) commented Jul 25, 2024

Meeting notes from 2024-07-25:
@oriolcg @amartinhuertas
Attendees: @GeliezaK @fverdugo

Notes:

  • discussed an exemplary parallel implementation of strength_graph, specifically the construction of the PSparseMatrix

Action Points:

  • implement strength_graph in parallel
  • implement collect_nodes_to_aggregates in parallel
  • implement tentative_prolongator_with_blocksize in parallel

@fverdugo (Owner):

@GeliezaK @amartinhuertas Regarding the bug discussed in the meeting on 2024-08-15.

This works for me with PartitionedArrays v0.5.2:

using PartitionedArrays
nodes_per_dir = (4,4,4)
parts_per_dir = (2,2,2)
p = prod(parts_per_dir)
ranks = DebugArray(LinearIndices((p,)))
args = linear_elasticity_fem(nodes_per_dir,parts_per_dir,ranks)
A = psparse(args...) |> fetch
x = pones(partition(axes(A,2)))
b = A*x
@assert collect(b) ≈ centralize(A)*collect(x)

In fact, the sparse matrix-vector product was already being used in the tests:

Y = A*pones(axes(A,2))

My guess is that you are modifying some of the data in partition(axes(A,2)) by mistake in your code.

@fverdugo (Owner):

@GeliezaK Can you please send your minimal reproducer?

@GeliezaK (Collaborator, Author) commented Aug 16, 2024

This is the minimal reproducer that we discussed yesterday:

parts_per_dir = (2,1)
p = prod(parts_per_dir)
ranks = DebugArray(LinearIndices((p,)))
nodes_per_dir = (4,1) 
args = PartitionedArrays.linear_elasticity_fem(nodes_per_dir,parts_per_dir,ranks)
A = psparse(args...) |> fetch
x = node_coordinates_unit_cube(nodes_per_dir,parts_per_dir, ranks)
B = near_nullspace_linear_elasticity(x, partition(axes(A,2)))
x_exact = pones(partition(axes(A,2)))
b = A*x_exact # error

EDIT: The error does not happen if A is multiplied with x_exact before constructing the nullspace.

@fverdugo (Owner):

I can reproduce the error. I know how to fix it.

@fverdugo (Owner):

Fixed in #166

@GeliezaK (Collaborator, Author):

Thank you very much @fverdugo ! It's working now.

@GeliezaK (Collaborator, Author):

I would like to implement a kind of allreduce with PartitionedArrays to get in each process an Array global_to_owner like this: [1,1,1,2,2]. Each process can initialize global_to_owner_local with local values: [1,1,1,0,0], [0,0,0,2,2]. Then I would like to use something like this but instead of one destination, I want to send to all processes:

global_to_owner = reduction(.+, global_to_owner_local, init=zeros(Int,n_global_coarse_dofs), destination=rank)

How can I achieve this with PartitionedArrays?

I need to pass global_to_owner to OwnAndGhostIndices in the parallel tentative prolongator. I discovered that passing global_to_owner = nothing results in an error when multiplying with P0 in smoothed_prolongator in the parallel version.

@fverdugo (Owner):

Take a look at how the dof partition is built from a node partition here:

function node_to_dof_partition(node_partition,D)

Maybe you can even use the same code, or just call the node_to_dof_partition function.

@GeliezaK (Collaborator, Author):

Thank you! It works now; I did not realize there was a global_to_owner implementation.

@GeliezaK (Collaborator, Author):

There seems to be some kind of bug in the code when calling amg_level_params_linear_elasticity.

This code

level_params = amg_level_params_linear_elasticity(block_size;
    coarsening = PartitionedSolvers.smoothed_aggregation_with_block_size(;)
)
fine_params = amg_fine_params(;level_params)
solver_le = amg(;fine_params)

produces these results when compared with a scalar preconditioner:

(figure: convergence_seq_amg_69984)

Whereas this code

level_params = amg_level_params_linear_elasticity(block_size;
    #coarsening = PartitionedSolvers.smoothed_aggregation_with_block_size(;)
)
fine_params = amg_fine_params(;level_params)
solver_le = amg(;fine_params)

leads to calling the coarsen() function only once, but the setup runtimes are much longer and the convergence looks like this:

(figure: convergence_seq_amg_69984_error)

My debugger unfortunately crashes in VS Code and I can't figure out what the problem is.

@fverdugo (Owner):

Can you share the entire code? Also how you set up the preconditioner.

The blue line of the first figure looks really good. This is what we want to report.

Francesc

@GeliezaK (Collaborator, Author):

This is the entire code @fverdugo :
performance_test_sequential.txt

@fverdugo (Owner):

It seems that you are using block_size == 1 internally in the first case and block_size == 3 in the second case. Thus, I would say that there is some issue in the new code for block_size != 1.

I would check

  • That the strength graph for linear elasticity with block_size == 3 is the same as (or similar to) the strength graph for Laplace on the same grid (same nodes_per_dir, parts_per_dir, etc.) with block_size == 1.
  • I would also double check the new tentative prolongator.

@GeliezaK (Collaborator, Author) commented Aug 22, 2024

I believe the problem might be that for epsilon = 0, the strength graph has a 1 in each cell by definition, and consequently the aggregate() function groups all nodes into one aggregate. If epsilon is set to 0.02, for instance, the results resemble the first plot. For block_size == 1, the strength graph computation is skipped.

EDIT:
I changed the default epsilon to 0.01 in the strength_graph, because 0 does not make sense. I added a check that epsilon > 0.
(figure: convergence_seq_eps=0 01_69984)

@GeliezaK (Collaborator, Author):

Update: it is not necessary to require epsilon > 0; it also works with epsilon == 0 if, in strength_graph, the blocks with norm == 0 are ignored (i.e. set to 0). I pushed the code and updated the images in the final PR.
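For reference, a dense sketch of a block strength graph with the symmetric measure ||A_ij|| >= epsilon * sqrt(||A_ii|| ||A_jj||) on block norms, including the zero-norm check mentioned above (the criterion, the names, and the push!-based construction are illustrative only; the actual strength_graph in the PR works on the sparse structure and avoids push!, as discussed earlier):

using LinearAlgebra, SparseArrays

function strength_graph_sketch(A::AbstractMatrix, block_size::Integer; epsilon = 0.0)
    n_nodes = div(size(A, 1), block_size)
    block(i, j) = A[(i-1)*block_size .+ (1:block_size), (j-1)*block_size .+ (1:block_size)]
    diag_norms = [norm(block(i, i)) for i in 1:n_nodes]
    I_, J_, V_ = Int[], Int[], Float64[]
    for j in 1:n_nodes, i in 1:n_nodes
        nij = norm(block(i, j))
        nij == 0 && continue   # ignore zero blocks, so epsilon == 0 also works
        if i == j || nij >= epsilon * sqrt(diag_norms[i] * diag_norms[j])
            push!(I_, i); push!(J_, j); push!(V_, 1.0)
        end
    end
    sparse(I_, J_, V_, n_nodes, n_nodes)
end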
