
support different compositions with particles #5963

Open
tjhei wants to merge 4 commits into main from composition-particles
Conversation

@tjhei (Member) commented Jul 10, 2024

This is a first version of particle support with different kinds of compositional fields.

@tjhei force-pushed the composition-particles branch 2 times, most recently from 5df5cff to 686a968 on July 11, 2024 at 23:42
@tjhei (Member Author) commented Jul 12, 2024

@gassmoeller This seems to be working now if you want to take a first look. I still need to add some tests and verify if everything works correctly.

I left two code paths in here (see line 881, "if (false)") to switch between grouping FEs that are the same (like main does right now) and not grouping them. Do you have a suggestion for how I can test the performance difference between the two? My current test sees no measurable difference, but I am not sure I am setting this up in a sensible way.

@gassmoeller (Member) commented:

> Do you have a suggestion for how I can test the performance difference between the two? My current test sees no measurable difference, but I am not sure I am setting this up in a sensible way.

You need a test that has plenty of particles per cell and many particle properties that require dynamic updates during the run (the particle advection only uses the velocity evaluator, so you won't see a difference in the "Particles: advect" timing, only in "Particles: update properties"). Maybe benchmark/integrated_strain/ (pure_shear and/or simple_shear) would be a good test case. You would have to increase the number of particles (it currently only creates 1), but those benchmarks have a particle property that requires updates and is relatively cheap to compute, so the solution evaluation should stand out as one of the major costs of the particle update process.

@gassmoeller (Member) left a comment:

Very cool, I am looking forward to using this. I left some comments; mostly, I forgot whether we group composition base elements of the same type that are not consecutive. If we do, I think your implementation here likely doesn't work (it evaluates n consecutive components as far as I can see, not n arbitrary components). If we do not, it should be fine.

@@ -627,7 +776,7 @@ namespace aspect
     // we create this derived class with an additional template. This makes it possible
     // to access the functionality through the base class, but create an object of this
     // derived class with the correct number of components at runtime.
-    template <int dim, int n_compositional_fields>
+    template <int dim>
     class SolutionEvaluatorsImplementation: public SolutionEvaluators<dim>
gassmoeller (Member):

If we no longer need this template, can we remove the distinction SolutionEvaluators/SolutionEvaluatorsImplementation again? That would simplify the code somewhat.

tjhei (Member Author):

Yes, this could go. Do you want to review this in a single large PR or separately?

DynamicFEPointEvaluationImpl(NonMatching::MappingInfo<dim> &mapping,
                             const FiniteElement<dim> &fe,
                             const unsigned int first_selected_component)
  : DynamicFEPointEvaluation<dim>(first_selected_component, n_components),
gassmoeller (Member):

What do you do if the n_components are not consecutive? FEPointEvaluation only assumes consecutive components, but as far as I remember we now allow merging the base elements of non-consecutive components?

tjhei (Member Author):

This only works for consecutive items (and this is what introspection returns). The only place where we do magic with non-consecutive items is with matrix block locations.
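For context, a minimal deal.II sketch of what evaluating a block of n consecutive components with FEPointEvaluation looks like. The element layout (velocity followed by three identical compositional fields), the names, and the component indices below are illustrative assumptions, not the PR's actual code:

```cpp
#include <deal.II/base/point.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/fe_point_evaluation.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/fe/fe_system.h>
#include <deal.II/fe/mapping_q.h>
#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/tria.h>
#include <deal.II/lac/vector.h>

#include <vector>

int main()
{
  constexpr int dim = 2;
  constexpr unsigned int n_compositions = 3; // hypothetical group size

  dealii::Triangulation<dim> triangulation;
  dealii::GridGenerator::hyper_cube(triangulation);

  // Hypothetical element layout: velocity (dim components) followed by a
  // consecutive block of n_compositions identical compositional fields.
  const dealii::FESystem<dim> fe(dealii::FE_Q<dim>(2), dim,
                                 dealii::FE_Q<dim>(2), n_compositions);
  dealii::DoFHandler<dim> dof_handler(triangulation);
  dof_handler.distribute_dofs(fe);

  dealii::Vector<double> solution(dof_handler.n_dofs());
  const dealii::MappingQ<dim> mapping(1);

  // One evaluator covers the whole consecutive block of compositional
  // components, starting at first_selected_component = dim.
  dealii::FEPointEvaluation<n_compositions, dim> evaluator(
    mapping, fe, dealii::update_values, /*first_selected_component=*/dim);

  const auto cell = dof_handler.begin_active();
  const std::vector<dealii::Point<dim>> unit_points = {dealii::Point<dim>(0.25, 0.5)};

  std::vector<double> cell_dof_values(fe.dofs_per_cell);
  cell->get_dof_values(solution, cell_dof_values.begin(), cell_dof_values.end());

  evaluator.reinit(cell, unit_points);
  evaluator.evaluate(cell_dof_values, dealii::EvaluationFlags::values);

  // One Tensor<1, n_compositions> per evaluation point: all fields of the
  // consecutive group are evaluated together.
  const auto composition_values = evaluator.get_value(0);
  (void)composition_values;
}
```

The key point is that a single evaluator with a first_selected_component and a component count handles exactly the kind of consecutive range that introspection returns.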

source/particle/world.cc (outdated conversation, resolved)
Comment on lines 650 to 654
template <int dim>
Tensor<1,dim> to_tensor(const Tensor<1,dim> &in)
{
return in;
}
gassmoeller (Member):

I don't understand the necessity for the to_tensor functions. At first I thought you use them to create a non-const copy, but in the place they are used below they are not modified. Can you explain and document?

Edit: Oh, do you need the specialization for dim == 1 because the type is different? That would be something to document.

tjhei (Member Author):

Exactly. This is documented now.
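For readers following along, a hedged sketch of what such a dim == 1 overload could look like (the exact signature in the PR may differ); the point is only that in 1d the evaluator's value type for a single component is a plain double rather than a Tensor<1,1>:

```cpp
#include <deal.II/base/tensor.h>

// Illustrative sketch, not necessarily the PR's exact code: in 1d the
// evaluator hands back a plain double, so this overload wraps it in a
// Tensor<1,1> to give all dimensions a common return type.
dealii::Tensor<1, 1> to_tensor(const double &in)
{
  dealii::Tensor<1, 1> result;
  result[0] = in;
  return result;
}
```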

result[c] = x[c];
return result;
}

gassmoeller (Member):

Check the empty lines between functions. Also, maybe add some documentation?

source/particle/world.cc (outdated conversation, resolved)
j);
for (const auto &eval : compositions)
{
const std::vector<double> values = eval->get_value(evaluation_point);
gassmoeller (Member):

Can we avoid the copy and maybe use a type that avoids the memory allocation altogether (e.g. boost::container::small_vector)?

tjhei (Member Author):

This should not allocate (return value optimization, as we return a copy from a locally constructed temporary in that function). I think a bigger optimization would be to do all quadrature points = particles at once. Is there a reason you evaluate them separately?

tjhei (Member Author):

(but we obviously allocate the std::vector inside that function)

gassmoeller (Member):

> I think a bigger optimization would be to do all quadrature points = particles at once. Is there a reason you evaluate them separately?

Why would that be faster? I thought get_value only returns the value that was computed earlier in evaluate (which is called in SolutionEvaluatorsImplementation::reinit)
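As an aside, a minimal sketch of the boost::container::small_vector idea suggested above; the capacity of 8 and the function name are arbitrary assumptions, chosen only to show that values for a handful of compositional fields can stay in the in-place buffer and avoid the heap entirely:

```cpp
#include <boost/container/small_vector.hpp>

// Hypothetical helper: up to 8 composition values live in the in-place
// buffer, so neither constructing nor returning the container allocates.
using CompositionValues = boost::container::small_vector<double, 8>;

CompositionValues get_composition_values_sketch(const unsigned int n_compositions)
{
  CompositionValues values(n_compositions, 0.0); // stack storage for n <= 8
  // ... fill with evaluated solution values here ...
  return values;                                 // RVO/move, still no heap allocation
}
```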

source/particle/world.cc (outdated conversation, resolved)
}
else
{
// We consider each group of consecutive compositions of the same time together. This is because it is more efficient
gassmoeller (Member):

Suggested change:
- // We consider each group of consecutive compositions of the same time together. This is because it is more efficient
+ // We consider each group of consecutive compositions of the same degree together. This is because it is more efficient

@gassmoeller (Member) left a comment:

> Yes, this could go. Do you want to review this in a single large PR or separately?

Let's do it separately.

I think this is almost good to go, but can you add a test?

@tjhei (Member Author) commented Jul 15, 2024

> I think this is almost good to go

Close, but I still need to address a few points.

I just did a quick benchmark (4 fields, 1M particles) with the following conclusions:

New batched:    117.7 B cycles
New unbatched:  119.1 B cycles
current main:   115.8 B cycles
(experimental:  115.7 B cycles)

where "particle: update properties" is 4.2% of the total runtime. Experimental is me trying to optimize the current version by:

  1. processing all quadrature points (= particles) at once
  2. filling the data in place instead of returning a std::vector

This is a bit messy, but I am basically replacing:

evaluators.reinit(..)
for (unsigned int i = 0; i<n_particles_in_cell; ++i)
  evaluators.get_solution(i, solution[i]);

by

evaluators.reinit(..)
evaluators.get_solutions(solution);

and yes, this is doable as reinit() initializes values with all quadrature points at once.
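To make the interface difference explicit, here is a purely hypothetical sketch of the two variants; the names get_solution/get_solutions come from the pseudocode above, everything else is invented for illustration:

```cpp
#include <vector>

// Sketch only: the batched get_solutions() replaces one virtual call per
// particle with a single call per cell, and writes into preallocated storage
// instead of creating temporaries inside the per-particle loop.
class EvaluatorSketch
{
public:
  virtual ~EvaluatorSketch() = default;

  // Per-particle variant: called n_particles_in_cell times after reinit().
  virtual void get_solution(const unsigned int particle_index,
                            std::vector<double> &particle_solution) const = 0;

  // Batched variant: called once per cell; solutions[i] receives the values
  // for particle i, for all positions that were passed to reinit().
  virtual void get_solutions(std::vector<std::vector<double>> &solutions) const = 0;
};
```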

@tjhei (Member Author) commented Jul 15, 2024

In other words, this PR is measurably slower, maybe by up to 25% (either because of the virtual function calls per particle or the temporary std::vectors).
