This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Blocking repeated ngrams during beam search #5216

Merged
merged 15 commits into allenai:main from the ngram-blocking branch on Jun 1, 2021

Conversation

danieldeutsch
Contributor

Closes #5205.

Changes proposed in this pull request:

  • Adds a new Constraint abstract class for enforcing constraints during beam search
  • Adds a NGramBlockingConstraint class to prevent decoding repeated ngrams.

This is a work in progress and may be close to finished, so I wanted to get feedback. @epwalsh, any thoughts on

  1. The Constraint interface? I limited the methods to just those I needed to implement the ngram blocking (see the sketch after this list).
  2. How to more efficiently implement the ngram blocking? I'm not sure there's a way to get around maintaining which ngrams have appeared before in a dictionary.
  3. How to test the end-to-end beam search with the ngram blocking constraint? I've tested the individual methods and some toy examples on my own. To really test it, I would need to come up with a transition matrix which has repeated ngrams by default and then block them. I couldn't think of a simple one which wouldn't require a lot of manual effort to ensure the output is correct.
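
For orientation, a rough sketch of the interface under discussion is below. It is only illustrative: the method names follow the excerpts quoted later in this thread (`apply()`, `update_state()`), but the exact signatures and the `ConstraintStateType` alias are assumptions rather than the merged code.

```python
from typing import Any, Dict, List, Optional

import torch

from allennlp.common.registrable import Registrable

# Assumed alias: one dictionary of bookkeeping per (batch item, beam slot).
ConstraintStateType = List[List[Dict[str, Any]]]


class Constraint(Registrable):
    """Lets a subclass veto tokens during beam search by adjusting the class log probabilities."""

    def init_state(self, batch_size: int) -> ConstraintStateType:
        """Build the initial per-batch, per-beam state before the first step."""
        raise NotImplementedError

    def apply(
        self,
        state: ConstraintStateType,
        class_log_probabilities: torch.Tensor,  # (batch_size, beam_size, num_classes)
    ) -> torch.Tensor:
        """Return log probabilities with disallowed classes pushed to a negligible value."""
        raise NotImplementedError

    def update_state(
        self,
        state: ConstraintStateType,
        last_prediction: torch.Tensor,  # (batch_size, beam_size)
        last_backpointer: Optional[torch.Tensor] = None,  # (batch_size, beam_size)
    ) -> ConstraintStateType:
        """Fold the newest predictions into the state, reordered by the beam backpointers."""
        raise NotImplementedError
```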

Member

@epwalsh epwalsh left a comment

This looks pretty good so far. I do think having an end-to-end test is important. Is it feasible to at least make a test that assumes ngram_size is 1?

One comment I have about the API for the Constraint class is that having to deal with backpointers in update_state() is not ideal. Can we automatically handle reordering the state list outside of update_state(), either in the BeamSearch class or the Constraint base class?
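
One way to do that, roughly the shape of the `_copy_state` helper excerpted later in this thread, is a base-class method that deep-copies each beam slot's state from the parent slot named by the backpointers. The signature below is a sketch under that assumption, not necessarily the merged code.

```python
import copy
from typing import Optional

import torch


def _copy_state(
    state,  # ConstraintStateType: per-batch, per-beam dictionaries of bookkeeping
    batch_size: int,
    beam_size: int,
    last_backpointer: Optional[torch.Tensor] = None,  # (batch_size, beam_size)
):
    new_state = []
    for i in range(batch_size):
        batch_state = []
        for j in range(beam_size):
            if last_backpointer is None:
                # First step of the search: every beam slot descends from the single
                # initial prediction, so there is nothing to reorder.
                backpointer = 0
            else:
                backpointer = last_backpointer[i, j].item()
            # Copy the parent beam's bookkeeping so each child beam can diverge independently.
            batch_state.append(copy.deepcopy(state[i][backpointer]))
        new_state.append(batch_state)
    return new_state
```

`update_state()` in the base class could then call this before folding in the newest predictions, so that subclasses never have to deal with backpointers at all.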

allennlp/nn/beam_search.py: 3 review threads (outdated, resolved)
@danieldeutsch danieldeutsch changed the title from "[WIP] Blocking repeated ngrams during beam search" to "Blocking repeated ngrams during beam search" on May 25, 2021
@danieldeutsch
Contributor Author

In this version:

  • I incorporated your comments on the code
  • I moved the logic to copy the parent's state into the Constraint class
  • I added end-to-end unit tests for using the n-gram blocking in BeamSearch with a search that I manually traced. See test_take_repeated_ngram_step

I removed "WIP" from the PR title because I think this version is complete to me, pending comments.

Member

@epwalsh epwalsh left a comment

Looks great @danieldeutsch! Just a few more minor comments.

CHANGELOG.md Outdated
@@ -33,6 +33,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added a `min_steps` parameter to `BeamSearch` to set a minimum length for the predicted sequences.
- Added the `FinalSequenceScorer` abstraction to calculate the final scores of the generated sequences in `BeamSearch`.
- Added `shuffle` argument to `BucketBatchSampler` which allows for disabling shuffling.
- Added a `Constrant` abstract class to `BeamSearch`, which allows for incorporating constraints on the predictions found by `BeamSearch`.
Member

Suggested change
- Added a `Constrant` abstract class to `BeamSearch`, which allows for incorporating constraints on the predictions found by `BeamSearch`.
- Added a `Constraint` abstract class to `BeamSearch`, which allows for incorporating constraints on the predictions found by `BeamSearch`.

class Constraint(Registrable):
    """
    An abstract class that can be used to enforce constraints on the output predictions
    by manipulate the class log probabilities during beam search.
Member

Suggested change
by manipulate the class log probabilities during beam search.
by manipulating the class log probabilities during beam search.


The `apply()` method should manipulate the `class_log_probabilities` in place to enforce the constraint
for this step of beam search. For instance, it may prevent a specific class from being selected by setting
the corresponding log probability to `-inf` (by using `min_value_of_dtype(class_log_probabilities.dtype)`).
Member

Suggested change
the corresponding log probability to `-inf` (by using `min_value_of_dtype(class_log_probabilities.dtype)`).
the corresponding log probability to a negligible value by using `min_value_of_dtype(class_log_probabilities.dtype)`, which is essentially equivalent to `-inf`.
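
As a small, concrete illustration of the wording above (the tensor shape and the blocked class index 7 are arbitrary):

```python
import torch

from allennlp.nn.util import min_value_of_dtype

# (batch_size=2, beam_size=3, num_classes=10)
class_log_probabilities = torch.randn(2, 3, 10)

# Make class 7 effectively unselectable for every batch item and beam slot.
class_log_probabilities[..., 7] = min_value_of_dtype(class_log_probabilities.dtype)
```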

Comment on lines 987 to 991
for constraint, constraint_state in zip(self.constraints, constraint_states):
    # shape: (batch_size, beam_size, num_classes)
    reshaped_class_log_probabilities = class_log_probabilities.view(
        batch_size, self.beam_size, -1
    )
Member

Any reason not to put the reshape outside of the loop?

Suggested change
for constraint, constraint_state in zip(self.constraints, constraint_states):
    # shape: (batch_size, beam_size, num_classes)
    reshaped_class_log_probabilities = class_log_probabilities.view(
        batch_size, self.beam_size, -1
    )
# shape: (batch_size, beam_size, num_classes)
reshaped_class_log_probabilities = class_log_probabilities.view(
    batch_size, self.beam_size, -1
)
for constraint, constraint_state in zip(self.constraints, constraint_states):

- `class_log_probabilities`, a tensor of shape `(batch_size, beam_size, num_classes)` that contains the
log probabilities for the classes during search. The first time `apply()` is called, `beam_size = 1`.

The `apply()` method should manipulate the `class_log_probabilities` in place to enforce the constraint
Member

I think it's more natural for the apply() method to return new class_log_probabilities. It's a little easier to reason about.

Member

@epwalsh epwalsh left a comment

LGTM, thanks @danieldeutsch!

@epwalsh epwalsh enabled auto-merge (squash) June 1, 2021 17:27
@epwalsh epwalsh merged commit c014232 into allenai:main Jun 1, 2021
@danieldeutsch danieldeutsch deleted the ngram-blocking branch June 1, 2021 19:12
@danieldeutsch
Contributor Author

Thanks @epwalsh for finishing it. I forgot about it over the holiday.

                backpointer = last_backpointer[i, j].item()
            batch_state.append(copy.deepcopy(state[i][backpointer]))
        new_state.append(batch_state)
    return new_state
Contributor

@JohnGiorgi JohnGiorgi Jul 21, 2021

@danieldeutsch @epwalsh Sorry, I know this PR is closed but I have a question that's easiest to ask here because of the existing context.

It seems that last_backpointer, which is passed to _copy_state by update_state, will always be None (it's never provided to update_state in BeamSearch). That would mean that backpointer is always 0 and that it is always state[i][0] being copied, regardless of the actual timestep. Isn't this a problem? The tests don't catch this because they provide the backpointers to update_state manually (see my other comment).

Contributor Author

Yes, it looks like you are right to me. I think the fix should be to pass backpointer here, right?

for i, constraint in enumerate(self.constraints):
    constraint_states[i] = constraint.update_state(
        constraint_states[i], restricted_predicted_classes
    )
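
If that is indeed the fix, the call above would presumably grow one extra argument, along these lines (this assumes the per-step backpointer tensor inside `BeamSearch.search` is called `backpointer` and that `update_state` accepts it as `last_backpointer`; the exact names in an eventual fix may differ):

```python
for i, constraint in enumerate(self.constraints):
    constraint_states[i] = constraint.update_state(
        constraint_states[i], restricted_predicted_classes, last_backpointer=backpointer
    )
```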

],
]
predictions = torch.LongTensor([[5, 6], [0, 3]])
backpointers = torch.LongTensor([[1, 1], [0, 1]])
Contributor

backpointers is provided to update_state here, but not in BeamSearch (see my other comment).

Contributor Author

I would need to retrace the beam search that I used to write this test, but my guess is that this is not caught because the backpointer happens to always be 0.

def test_repeated_ngram_blocking_end_to_end(self):

Abhishek-P pushed a commit to Abhishek-P/allennlp that referenced this pull request Aug 11, 2021
* Implementing blocking repeated ngrams

* Adding comment

* Adding unit tests for the end to end beam search

* Renaming class

* Adding comment about  function

* Simplifying indexing to variable

* Refactoring the state copying into the  class

* Reformatting

* Editing changelog

* fix line too long

* comments

* doc updates

Co-authored-by: Pete <[email protected]>
Co-authored-by: epwalsh <[email protected]>
Development

Successfully merging this pull request may close these issues.

Add features to beam search that are supported in other libraries
3 participants