Skip to content

Conversation

@v4hn
Copy link
Contributor

@v4hn v4hn commented Sep 1, 2021

Here is my recent effort to get the tests proposed in #280 to pass by what I consider correct behavior for Fallbacks.

I also added further tests for various edge-cases.
Please have a look at the set of tests first.

It turned out to be quite a bit more work than I anticipated, especially with Pruning involved...
@j-kuehn you might want to have a look to see what I ended up with. :-)

@v4hn v4hn requested a review from rhaschke September 1, 2021 21:34
@codecov
Copy link

codecov bot commented Sep 1, 2021

Codecov Report

Merging #287 (c4b2004) into master (ed45970) will increase coverage by 0.84%.
The diff coverage is 87.34%.

❗ Current head c4b2004 differs from pull request most recent head f22778c. Consider uploading reports for the commit f22778c to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #287      +/-   ##
==========================================
+ Coverage   52.99%   53.83%   +0.84%     
==========================================
  Files         102      102              
  Lines        7598     7783     +185     
==========================================
+ Hits         4026     4189     +163     
- Misses       3572     3594      +22     
Impacted Files Coverage Δ
core/include/moveit/task_constructor/container.h 90.91% <ø> (-0.75%) ⬇️
core/include/moveit/task_constructor/utils.h 100.00% <ø> (ø)
core/src/container.cpp 70.86% <84.51%> (+1.61%) ⬆️
core/src/storage.cpp 85.04% <89.48%> (+0.08%) ⬆️
core/src/stage.cpp 81.58% <92.86%> (+2.10%) ⬆️
core/include/moveit/task_constructor/container_p.h 95.46% <100.00%> (+1.71%) ⬆️
core/include/moveit/task_constructor/stage_p.h 91.67% <100.00%> (-0.11%) ⬇️
core/include/moveit/task_constructor/storage.h 95.56% <100.00%> (+0.21%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed45970...f22778c. Read the comment docs.

@v4hn v4hn force-pushed the mtc-fallback-reworked branch from dc1c9ca to b016a9a Compare September 2, 2021 06:50
Copy link
Contributor

@rhaschke rhaschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for pushing this forward! I generally agree with the tests.
However, I need to add some test cases to point out the undesired behaviour I noticed (see below).
I pushed some minor cosmetic changes as well.

@v4hn v4hn force-pushed the mtc-fallback-reworked branch 2 times, most recently from c86d799 to 2bac879 Compare September 3, 2021 10:48
@v4hn
Copy link
Contributor Author

v4hn commented Sep 3, 2021

Good to see CI succeed for the first time in a while ;)

@rhaschke
Copy link
Contributor

rhaschke commented Sep 9, 2021

I pushed a cleanup for computeGenerate().

Comment on lines +879 to +940
// This override is deliberately empty.
// The method prunes solution paths when a child failed to find a valid solution for it,
// but in Fallbacks the next child might still yield a successful solution
// Thus pruning must only occur once the last child is exhausted (inside computeFromExternal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I agree with you. We do need a mechanism for a stage to claim that it is exhausted. onNewFailure() doesn't perfectly fit that purpose, because it's called on any failure:
https://github.com/ros-planning/moveit_task_constructor/blob/f51f6eb982f90f3c31d9adfe1fc6e041d467c5bb/core/src/stage.cpp#L140-L143

I think we should rename the function to onExhausted and only call it when the solutions are actually exhausted. For basic Propagator and Connect stages that's always the case if they fail. However, Generator stages and containers will need special handling.
Actually, for Generator stages exhausted is closely related to !canCompute(). Not sure that assumption holds for generator-like containers as well.

I any case, we might need to distinguish between "currently exhausted" and "finally exhausted". For example, MonitoringGenerators might be currently not able to compute something, but later, if they received new input, they can. The same holds for connect-like containers, where we already have seen that we can re-enable previously pruned solution branches.

Given this new semantics of the function, the Fallbacks container can easily check whether child actually exhausted its solutions and thus propagate this info upwards if needed.

Copy link
Contributor

@rhaschke rhaschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the function computeFromExternal() only handles propagator stages. Connect-like stages require their own set of functions, because you actually want to consider each pair of (compatible) solutions separately - much like you consider each incoming external state of a propagator child separately now.

@v4hn v4hn force-pushed the mtc-fallback-reworked branch 3 times, most recently from 82d9163 to 6f808af Compare September 16, 2021 12:12
@rhaschke
Copy link
Contributor

rhaschke commented Sep 17, 2021

I pushed a bunch of new tests pointing out that priority propagation from external to internal interfaces of containers doesn't work yet.
Unfortunately, I won't have time next week to progress on this.

@rhaschke
Copy link
Contributor

Sorry for the mess. It's too late...

@rhaschke rhaschke force-pushed the mtc-fallback-reworked branch from 6555a3e to 08e8529 Compare September 17, 2021 10:17
@rhaschke
Copy link
Contributor

rhaschke commented Sep 17, 2021

I couldn't put the topic aside yet and pushed some more commits. Although the current (simple) tests work, the fundamental problem remains that a (fallback) container cannot decide yet whether a child is exhausted on a specific external state (a single one for propagating, a pair for connecting interfaces). If the simple ForwardMockups would get replaced by a SerialContainer that will take several compute() iterations to generate a solution, I suspect that this problem will show up.

However, I'm loosely convinced that we can solve the issue nevertheless: We need to change the semantics of onNewFailure() to onExhausted() such that a child can (and has to) explicitly state that it became exhausted on a specific input state (or pair).
If we can trust children to report that correctly, this info can be escalated to the parents. This might make canCompute() superfluous.

If you want to keep (for now) the current implementation fixing the active pending state, I have this implemented as an alternative as well in my repo: v4hn/moveit_task_constructor@mtc-fallback-reworked...ubi-agni:mtc-fallback-reworked
However, this makes the new tests fail, which essentially all rely on the correct ordering of pending states.

@v4hn
Copy link
Contributor Author

v4hn commented Sep 17, 2021

I couldn't put the topic aside yet

Weren't you the one who said we should coordinate our efforts? 💢
I wrote 230 lines of patches for this branch yesterday afternoon that were not ready for pushing yet and conflict now.

I have a feverish kid sitting on my lap, so I don't the nerves right now to either fixup&polish my local patches or review your commits from the last 17 hours. Let's see whether I can keep my eyes open for long enough this evening to look into it.

@rhaschke
Copy link
Contributor

Sorry to hear that. If it helps you, just force-push your changes. I'm curious what they address. But, as I said, I need to turn to other stuff for next week...

@rhaschke
Copy link
Contributor

I think we should split this PR.

  • There is whole bunch of commits adding tests. I think we agree on those and should commit them, disabling them if needed.
  • There are a few commits working on pruning / un-pruning (1292d3a, 1292d3a, b14acfc). These are the most controversial in my opinion. Would be good if you could provide failing tests for them or a sketch as in Rework Pruning #221 for illustration.
  • commits actually improving the Fallbacks container
  • generic improvements (b14acfc, 8b49d86, ffd884d)

This was referenced Sep 20, 2021
@v4hn v4hn force-pushed the mtc-fallback-reworked branch from 08e8529 to ea4a2c0 Compare September 23, 2021 13:15
@v4hn
Copy link
Contributor Author

v4hn commented Sep 23, 2021

I finally finished splitting&rebasing the branch.
I included most of your commits in one of the three PRs.

I explicitly excluded v4hn@ece9cd0 because I changed the logic quite a lot in my own commit and I believe your patch can result in pushing multiple states to the same child and thus mess with pruning because there is nothing connecting the entry in pending_ to the corresponding call to runCompute. It's quite entangled and I did not attempt to write a test.

@v4hn v4hn force-pushed the mtc-fallback-reworked branch from ea4a2c0 to cd980ad Compare September 26, 2021 21:16
@v4hn
Copy link
Contributor Author

v4hn commented Oct 4, 2021

@rhaschke if you should have time after summer school, iros and DGR days, this should be ready for merge (once more).

@rhaschke
Copy link
Contributor

Sorry for the delay. I will try to have a look into this over the weekend latest.

Both, failed and pruned states might get re-enabled later!
This also required rework (simplification) of the sorting function for pending pairs.
- Centrally distinguish between have owner() or not in InterfaceState::updatePriority()
- Have a separate updateStatus() method to just update the pruning status
- Split Interface::updatePriority() into a method taking the InterfaceState*
  and one taking an Interface::iterator (for efficiency)
- Early return in container.cpp's updateStatePrios()
- Switch directions: FORWARD <-> BACKWARD to make the function reusable for status propagation.
- We need to ignore the source state when looking for opposite states of the target state.
  Thus add both, source and target state arguments.
... to avoid explicit template initialization
@rhaschke rhaschke mentioned this pull request Nov 20, 2021
... to better indicate that such a state can be immediately re-enabled.
This also requires to drop the assertion in SerialContainer::onNewSolution()
that new solutions will have enabled start+end states (a CONNECT stage's solution might not).
As only the InterfaceState* variant is actually called,
we can drop the splitting introduced for performance reasons in
29d1e44
Not only propagate updates along solution paths, but also bridge
the gap of a `Connecting` stage.

- If a state becomes enabled, re-enable opposite `ARMED` states as well.
- If a state becomes pruned, also prune opposite states if they don't have alternatives.
- Make sure that we don't run into a recursive update loop by disabling notify() callbacks.
Also remove unused pushInterface(dir)
to allow propagating status updates only if the STATUS actually changed.
... to use operator<<
Copy link
Contributor

@rhaschke rhaschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me quite a while to understand your refactorings as you provided only very sparse documentation, both in commit comments and in code 😞
The logic of switching between start and end queue is probably motivated by this commit comment of 23f32d7:

with a single ordered queue, jobs from one direction were always preferred

However, I don't see this imbalance (yet). I suggest having a chat these days to clarify this.
Considering a propagator stage, I would expect that a single queue will order states from both ends by their priority, which is exactly what I would expect. I think we dropped the stage, which could propagate from either side anyway, didn't we?

For CONNECT-like interfaces, I think we need a different approach, namely a job queue of (start, end) pairs, much like in the Connecting stage.

Hence we will have 3 implementations: for generator, propagator, and connect stages.
To simplify the distinction between those, I suggest having 3 FallbacksPrivate classes deriving from a common base class. When initializing the interface in initializeExternalInterfaces() (which would be implemented by the base class), the actually required class type could be instantiated. This separation would nicely separate data structures required for these variants and avoid the interface-type-based switches. What do you think?

Most review comments I directly implemented in some additional commits. Please have a look. I'm going to merge with #309 as well and check a disabled unit test there...

Comment on lines 981 to 984
current.valid = (*current.state.stage)->pimpl()->canCompute();

if(!current.valid)
advanceCurrentStateToNextChild();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation here doesn't consider the PRUNED / FAILED status of the state(s). Connecting states can get reactivated if an opposite state occurs. However, here you immediately advance to the next child.
I don't believe the current approach scales to CONNECT-like interfaces, but it requires an extra pending list of state pairs (from start and end).

@rhaschke
Copy link
Contributor

Superseded by #311.

@rhaschke rhaschke closed this Nov 25, 2021
rhaschke added a commit to ubi-agni/moveit_task_constructor that referenced this pull request Nov 26, 2021
While the previous adaption makes moveit#287 fail, it still succeeded here.
Now it fails here as well ;-)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants