Conversation

@Narsil (Contributor) commented Jan 6, 2021

What does this PR do?

  • Currently a Conversation contains some state (notably conversation.history).
  • There is no obvious way to create a conversation from existing logs other than by mutating that state.
  • The actual result is still buggy, because history is not correctly updated by the Conversation object.

Objectives of this PR:

  • Enable creation of a Conversation from existing exchanges (see the sketch after this list):
    Conversation("Why do you recommend it ?", past_user_inputs=["Can you recommend a book ?"], generated_responses=["I recommend reading the Lord of the Rings."])
  • Keep relatively close to the previous code.
  • Fix the bug that simply discarded history if you created a Conversation through mutation of state (could be backward incompatible).
  • Rename history to _history and add _index, as it is now treated as a cache variable (namely, to prevent re-tokenizing the whole conversation all the time).
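
A minimal sketch of the intended end-to-end usage described above; the task name and the default checkpoint resolution are illustrative:

from transformers import Conversation, pipeline

# Illustrative: any conversational checkpoint should work here.
conversational_agent = pipeline("conversational")

# Build a Conversation from existing logs instead of mutating its state turn by turn.
conversation = Conversation(
    "Why do you recommend it ?",
    past_user_inputs=["Can you recommend a book ?"],
    generated_responses=["I recommend reading the Lord of the Rings."],
)
conversation = conversational_agent(conversation)
print(conversation.generated_responses[-1])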

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@mfuntowicz
@sgugger

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten (Contributor):

That's really cool! Also pinging @guillaume-be here, as he is the original author of the pipeline :-)

Narsil added 2 commits January 6, 2021 14:17
+ remove dead enumerate
+ improve warning message.
@sgugger (Collaborator) left a comment

Looks very cool!

history.extend(new_history)
conversation._index = i + index + 1
conversation._history = history
return history[:]
@sgugger (Collaborator):

Suggested change:
- return history[:]
+ return history

@Narsil (Contributor, Author):

Same here, and even more important: don't pass references to cached values back to users. They can do whatever they want with a copy, but with a reference they might corrupt the cache:

history = conversation._get_history()
history.append("a")  # Now conversation._history also contains "a" if I'm not mistaken
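
A minimal sketch of the defensive-copy pattern being argued for here (the method body is illustrative):

def _get_history(self):
    # Hand back a shallow copy so callers cannot mutate the cached list in place.
    return self._history[:]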

self.assertRaises(Exception, nlp, self.invalid_inputs)


class DummyTok:
@sgugger (Collaborator) commented Jan 6, 2021:

This class uses torch, so it should probably be inside an if is_torch_available() block.

Edit: Looks fine in the tests as long as we don't instantiate it without torch present, so feel free to ignore this comment.
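
For reference, the suggested guard might look roughly like this (the DummyTok body is elided):

from transformers import is_torch_available

if is_torch_available():
    import torch

    class DummyTok:
        # torch-dependent test helper; only defined when torch is installed
        ...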

@Narsil (Contributor, Author):

I will ignore it for the time being. I think this might change anyway, linked to this discussion: #9432 (comment)


@require_torch
def test_integration_torch_conversation(self):
nlp = self.get_pipeline()
@sgugger (Collaborator):

Please, let's all stop naming a pipeline with a generic nlp. It's not the whole field of NLP, but a conversational agent, so conversational_agent or something in the same vein/shorter would be better here :-)

@Narsil (Contributor, Author):

I'm all for it! Do you mind if we keep this for a separate PR?
nlp = pipeline(...) is used pretty much everywhere in tests, and I don't want to break uniformity here.

@sgugger (Collaborator):

I don't mind, but I'd really like this done. Let me know if you don't plan to tackle it soon and I'll make a good first issue out of it.

@Narsil (Contributor, Author):

Let's do a good first issue. It seems to really be a big search-and-replace, but it's always nice to have those kinds of issues for newcomers, right?

@patrickvonplaten (Contributor):

I agree with @sgugger here. Also, I think it's no problem to break "naming" unity in the tests as long as the new name is better. It's easier to do a good first issue if one has this name as a reference for how it should be done, IMO.

@Narsil (Contributor, Author):

Perfect. I'll change the name here and create a good first issue; feel free to edit it afterwards.

@patrickvonplaten (Contributor) left a comment

Cool!

@patrickvonplaten (Contributor):

Also, @Narsil do you know if it's possible to have a chat bot widget in the inference API for this pipeline? I think it would be really nice to play around with Blenderbot and DialoGPT.

@Narsil (Contributor, Author) commented Jan 7, 2021

> Also, @Narsil do you know if it's possible to have a chat bot widget in the inference API for this pipeline? I think it would be really nice to play around with Blenderbot and DialoGPT.

@patrickvonplaten it's in the pipes, but I've not yet created the widget for huggingface.co, the api-inference is ready though.

@patrickvonplaten, @sgugger, can you please re-review? There was a sort of major bug where we used

tokenizer.encode(inputs, add_special_tokens=False)

so that BOS and EOS were not added on models that required them (instead, EOS was added "manually" by the pipeline, leading to poor results on Blenderbot, for instance).
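
To illustrate the two behaviours (checkpoint name is illustrative; the exact special tokens depend on the model):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")

# Old behaviour: suppress special tokens, then append EOS "manually".
ids_manual = tokenizer.encode("Hello", add_special_tokens=False)
ids_manual.append(tokenizer.eos_token_id)

# New behaviour: let the tokenizer apply its own special-token logic.
ids_auto = tokenizer.encode("Hello")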

Pinging @mfuntowicz to make sure we can safely remove that, or to learn whether there was a strong reason for bypassing the tokenizer logic there.

@Narsil (Contributor, Author) commented Jan 7, 2021

Also changed the tests to use a real tokenizer instead of the dummy one.

@guillaume-be (Contributor) commented Jan 7, 2021

Thanks for looping me in! It looks like there are a lot of changes, a few comments on my side:

  • Regarding the change from

inputs = self.tokenizer(inputs, add_special_tokens=False, padding=False).get("input_ids", [])
for input in inputs:
    input.append(self.tokenizer.eos_token_id)

to:

inputs = self.tokenizer(inputs, **kwargs).get("input_ids", [])

are you sure that the behaviour remains correct for DialoGPT? As far as I know, DialoGPT uses the GPT-2 tokenizer, which does not add an EOS automatically at the end of the encoded input (a quick check is sketched after this list). Tests for BlenderBot were added in

def test_integration_torch_conversation_encoder_decoder(self):

and I did not observe poor performance back then; did something change? Also note that BlenderBot does not seem to require a BOS token (see

def build_inputs_with_special_tokens(self, token_ids_0: List[int], token_ids_1: List[int] = None):

).

  • The if len(new_input) > max_length - self.min_length_for_response check was set up to let the history leave some space for future responses. Is this now handled elsewhere as part of the history logic?
  • Could you please clarify the need for _get_history instead of accessing the history directly?
  • Regarding the title of the PR: if you are interested, I added this feature to the Rust version of this pipeline a few months ago. The approach seems simpler than the changes proposed here; am I missing something? See https://github.com/guillaume-be/rust-bert/blob/7890d2daffea8e2c792a2e8930294e403b2321dd/src/pipelines/conversation.rs#L416 for reference (I see from your activity that you are familiar with Rust!).
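
As a quick way to check the DialoGPT point from the first bullet, one could inspect the encoding directly (a sketch; checkpoint name from the DialoGPT family):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")

ids = tokenizer("Can you recommend a book ?")["input_ids"]
# GPT-2-style tokenizers do not append EOS on their own, so this is expected
# to print False, meaning the pipeline must handle EOS itself.
print(ids[-1] == tokenizer.eos_token_id)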

Thanks!

@sgugger (Collaborator) left a comment

Still LGTM!

@Narsil (Contributor, Author) commented Jan 8, 2021

Hi @guillaume-be,

Those changes do not belong in this PR anyway; I'll make a separate PR following this one, and we should continue the discussion over there.

@LysandreJik (Member) left a comment

LGTM!

Narsil merged commit 02e05fb into master on Jan 8, 2021
Narsil deleted the stateless_conversation branch on January 8, 2021 at 13:33
@LysandreJik (Member):

@Narsil (Contributor, Author) commented Jan 11, 2021

Yes, it was missing a rebase before the tests ran; another commit had introduced a new warning, which broke the test.

I am not sure what the strategy is concerning warnings and tests. I've tried to be conservative (meaning explicitly testing them), but I know it might become cumbersome at some point; I can remove those checks if needed.
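
For context, the conservative approach described here could look roughly like this in a unittest-style test (the helper name is hypothetical):

import unittest

class ConversationWarningTest(unittest.TestCase):
    def test_history_truncation_warns(self):
        # Assert the warning explicitly so a new or changed warning breaks
        # the test loudly instead of passing silently.
        with self.assertLogs("transformers", level="WARNING") as logs:
            run_conversation_that_truncates_history()  # hypothetical helper
        self.assertEqual(len(logs.output), 1)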
