Composite Entities as part of Rasa NLU #3765

Closed
BeWe11 opened this issue Jun 12, 2019 · 33 comments
Labels: area:rasa-oss 🎡 (Anything related to the open source Rasa framework), type:enhancement ✨ (Additions of new features or changes to existing ones, should be doable in a single PR)

Comments

@BeWe11

BeWe11 commented Jun 12, 2019

Description of Problem:
Rasa currently has no native support for composite entities (entities that consist of multiple sub-entities). This feature is available in competitors like wit.ai and dialogflow.

Overview of the Solution:
A while ago, I implemented composite entities as a custom component. This has worked for my use cases. However, having this functionality as a separate component has some serious drawbacks:

  1. The feature is basic enough that it's annoying to have to install a separate library
  2. Getting the required training data for the composite definitions is messy, as there is no canonical way to pass additional training data fields to components
  3. Because of 2), I have to rely on some internal methods that have changed their signatures multiple times and broken things.

My proposal is now that I take the functionality from this component and put it in a pull request to make it available throughout rasa. Definitions of composite entities would then be first class training data, meaning that they are defined the same way as e.g. lookup tables.

For example, a Markdown file containing NLU training data could contain another category defining composites of entities (an "@" is used to mark entity names):

## composite:car
- @color @brand 

I'm open to discussion about the specifics of my implementation. Is there any argument against proceeding with this? One argument I could imagine is not wanting to increase the number of fields in the training files further.
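
As a rough illustration of how such a section could be read, here is a minimal sketch in Python. It is not Rasa's actual Markdown reader; the function name `parse_composites` and the sample data in `NLU_MD` are hypothetical and only assume the `## composite:<name>` layout proposed above.

```python
import re
from typing import Dict, List, Optional

# Hypothetical header format, taken from the proposal: "## composite:<name>"
SECTION = re.compile(r"^##\s*composite:(?P<name>\S+)\s*$")


def parse_composites(markdown: str) -> Dict[str, List[str]]:
    """Collect the bullet items listed under each '## composite:<name>' header."""
    composites: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in markdown.splitlines():
        header = SECTION.match(line)
        if header:
            current = header.group("name")
            composites.setdefault(current, [])
        elif line.startswith("##"):
            # Any other section (e.g. an intent) ends the composite block.
            current = None
        elif current and line.lstrip().startswith("- "):
            composites[current].append(line.lstrip()[2:].strip())
    return composites


# Made-up sample data for illustration only.
NLU_MD = """\
## intent:buy_car
- I want a [red](color) [Audi](brand)

## composite:car
- @color @brand
"""

print(parse_composites(NLU_MD))
```

Lines under other headers are ignored, so the composite section can sit next to intents or lookup tables in the same file.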

@BeWe11 BeWe11 added the type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR label Jun 12, 2019
@akelad
Contributor

akelad commented Jun 13, 2019

Hi @BeWe11 thanks for the suggestion! We actually decided against merging a feature like this a while ago: #1475
That's not to say we won't consider merging yours -- could you provide some more details to your approach?

@BeWe11
Author

BeWe11 commented Jun 14, 2019

Hey @akelad, thank you for your response and the link, I'm aware of the previous pull request.

The approach of the closed pull request had various assumptions about the training data built in, e.g. about time and date entities, it used lookup tables to find entities etc.

My approach is a simple regex check that only kicks in after all other entity extractors have been applied. It checks whether a predefined pattern (say @color @product with @extras) matches a parsed sentence structure (say red shoes with shoelaces, where red, shoes and shoelaces have been recognized as entities of type color, product and extras). For a full example, see the GitHub page that I've linked.
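
The check described above could look roughly like the following Python sketch. It is one possible reading of the idea, not the code of the linked component: the helper names `to_placeholder_text` and `matches_composite` are hypothetical, and the entity dictionaries assume the usual `entity`/`start`/`end` keys of extractor output.

```python
import re
from typing import Any, Dict, List

Entity = Dict[str, Any]  # assumed keys: "entity", "value", "start", "end"


def to_placeholder_text(text: str, entities: List[Entity]) -> str:
    """Replace every recognized entity span with '@<entity type>'."""
    parts = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        parts.append(text[cursor:ent["start"]])  # literal text before the entity
        parts.append("@" + ent["entity"])        # placeholder for the entity
        cursor = ent["end"]
    parts.append(text[cursor:])                  # literal text after the last entity
    return "".join(parts)


def matches_composite(pattern: str, text: str, entities: List[Entity]) -> bool:
    """Return True if the regex pattern occurs in the placeholder form of the text."""
    return re.search(pattern, to_placeholder_text(text, entities)) is not None


text = "red shoes with shoelaces"
entities = [
    {"entity": "color", "value": "red", "start": 0, "end": 3},
    {"entity": "product", "value": "shoes", "start": 4, "end": 9},
    {"entity": "extras", "value": "shoelaces", "start": 15, "end": 24},
]

print(to_placeholder_text(text, entities))
print(matches_composite("@color @product with @extras", text, entities))
```

Because the check runs on the output of the other extractors, it needs nothing beyond the entity spans they already produce.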

This is very similar to the dialogflow implementation. There is no complicated interaction with other extractors, and it doesn't require any restructuring of training data. The only thing required is a definition of some named patterns. The way I'm imagining it, these definitions would be placed alongside other "extra" definitions like lookup tables, for example:

{
    "rasa_nlu_data": {
        "common_examples": [],
        "regex_features" : [],
        "lookup_tables"  : [],
        "entity_synonyms": [],
        "composite_patterns": [
            {
               "name": "product",
               "patterns": [
                   "@color @brand with @extras"
               ]
            }
        ]
    }
}
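
To make the proposed data format concrete, here is a small Python sketch that reads the `composite_patterns` list from such a file and compiles each pattern. The function name `load_composite_patterns` is hypothetical, and the key name is only the one proposed above, not an existing Rasa training data field.

```python
import json
import re
from typing import Dict, List


def load_composite_patterns(raw: str) -> Dict[str, List[re.Pattern]]:
    """Map each composite name to its compiled regex patterns."""
    data = json.loads(raw)["rasa_nlu_data"]
    compiled: Dict[str, List[re.Pattern]] = {}
    # A missing "composite_patterns" key is treated as "no composites defined".
    for definition in data.get("composite_patterns", []):
        compiled[definition["name"]] = [
            re.compile(pattern) for pattern in definition["patterns"]
        ]
    return compiled


TRAINING_DATA = """
{
    "rasa_nlu_data": {
        "common_examples": [],
        "composite_patterns": [
            {"name": "product", "patterns": ["@color @brand with @extras"]}
        ]
    }
}
"""

patterns = load_composite_patterns(TRAINING_DATA)
print({name: [p.pattern for p in plist] for name, plist in patterns.items()})
```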

I can understand that you might not want to include yet another category in the training data. I'd like to argue that the feature of composite entities is so basic that there might be a case for making the data available natively.

@akelad
Contributor

akelad commented Jun 17, 2019

Ok cool, thanks for the detailed info! We'll discuss and let you know

@tmbo tmbo added type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. area:rasa-oss 🎡 Anything related to the open source Rasa framework and removed requires disussion labels Jun 25, 2019
@MetcalfeTom
Contributor

Hi @BeWe11,

I think your composite_patterns definition is a very intuitive implementation, however does it generalise? i.e. in this example does every composite entity need to contain the with token in order to be recognised?

@BrianYing

Hi @MetcalfeTom ,

I have been using @BeWe11 's composite_entity_extractor for a while and everything works great for me. From my understanding and experience, the with token is not necessary for every composite entity. The patterns field lists all the combinations that you want to be recognized under the given name; each combination can mix entities and plain text strings. In this example

{
    "name": "product_with_attributes",
    "patterns": [
      "@color @product with @pattern",
      "@pattern @color @product"
    ]
}

Both patterns, @color @product with @pattern and @pattern @color @product, will be recognized as product_with_attributes; in the first one, "with" is a literal string. So it is also possible to add another pattern such as @color @product of @pattern, and it will be recognized as the same composite entity.
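
A short Python sketch of that behaviour, under the assumption that the sentence has already been reduced to entity placeholders: the function `find_composite` is hypothetical and simply returns the name of the first composite with a matching pattern.

```python
import re
from typing import Dict, List, Optional

COMPOSITES: Dict[str, List[str]] = {
    "product_with_attributes": [
        "@color @product with @pattern",
        "@pattern @color @product",
    ],
}


def find_composite(placeholder_text: str, composites: Dict[str, List[str]]) -> Optional[str]:
    """Return the name of the first composite whose pattern matches, else None."""
    for name, patterns in composites.items():
        for pattern in patterns:
            if re.search(pattern, placeholder_text):
                return name
    return None


print(find_composite("@color @product with @pattern", COMPOSITES))
print(find_composite("@pattern @color @product", COMPOSITES))

# The "of" variant only matches once it has been added as a further pattern.
print(find_composite("@color @product of @pattern", COMPOSITES))
COMPOSITES["product_with_attributes"].append("@color @product of @pattern")
print(find_composite("@color @product of @pattern", COMPOSITES))
```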

Thanks for considering this repo. Composite entity is a feature that we used a lot so really hope rasa could support it natively.

@BeWe11
Author

BeWe11 commented Jul 9, 2019

Hey @MetcalfeTom,

with patterns being regexes, the composites generalize to everything that can be expressed as a regex. For example, the with could be made optional by using

@color @product (?:with )?@pattern

Alternatively, you could just define multiple patterns for the same composite entity as in @BrianYing's example.

Using raw regexes for pattern definitions provides a lot of flexibility. The downside is that defining regexes for complex use-cases can be quite tricky. Defining patterns is kind of a "set-and-forget" task though, so the trade-off might be justified.
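
The two options mentioned in this comment can be compared with a few lines of Python. This is only a sketch: it assumes the sentence has already been turned into placeholder text and uses `fullmatch` for strictness, which may differ from how the component applies its patterns. The helper `matches_any` is hypothetical.

```python
import re

# Single pattern with an optional literal "with ".
OPTIONAL_WITH = re.compile(r"@color @product (?:with )?@pattern")

# The same composite expressed as two separate patterns.
TWO_PATTERNS = [
    re.compile(r"@color @product with @pattern"),
    re.compile(r"@color @product @pattern"),
]


def matches_any(patterns, text: str) -> bool:
    """True if at least one pattern matches the whole text."""
    return any(p.fullmatch(text) for p in patterns)


for candidate in (
    "@color @product with @pattern",
    "@color @product @pattern",
    "@color @product of @pattern",
):
    print(candidate, bool(OPTIONAL_WITH.fullmatch(candidate)), matches_any(TWO_PATTERNS, candidate))
```

Both formulations accept the first two candidates and reject the third; the single regex is more compact, while the list of plain patterns is easier to read.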

@amn41
Contributor

amn41 commented Jul 9, 2019

hey @BeWe11 - thanks for that. I created a draft PR with an alternative approach where we train a second CRF. This might be more flexible because it could allow for entity roles as well as compound entities. Very much WIP though #3889

@BeWe11
Author

BeWe11 commented Jul 10, 2019

Hey @amn41,

from skimming over your code, I'm not quite sure what these entity roles are and how they are implemented. Would roles be given in the training examples, i.e. one would have to tag individual entities as well as groups of entities per example sentence?

@amn41
Contributor

amn41 commented Jul 10, 2019

haven't thought too much about the data format yet, but added some notes about the general idea to the PR description

@tmbo
Member

tmbo commented Aug 20, 2019

I think next step is to better evaluate @amn41's proposal. @BeWe11 / @BrianYing do you have any data we could use to test this approach? (if you do not want to share it publicly but could share it under NDA, please write me a mail)

decision: we want this, question is how the detection is implemented (e.g. does an ML based approach like the one proposed by @amn41 work).

@tmbo tmbo added 🧪research and removed type:discussion 👨‍👧‍👦 Early stage of an idea or validation of thoughts. Should NOT be closed by PR. labels Aug 20, 2019
@cyrilthank

cyrilthank commented Oct 3, 2019

Hi @tmbo @JustinaPetr can my organization leverage our existing partnership with rasa to work together on this for a custom solution?

@akelad
Contributor

akelad commented Oct 8, 2019

Hey @cyrilthank which organization are you with?

@cyrilthank

Hi @akelad I am with http://aiware.ai/ and rasa partnership with http://cleareye.ai/

@akelad
Contributor

akelad commented Oct 9, 2019

@cyrilthank as far as I'm aware we don't have any partnerships with either of those companies. In any case though, the next step at the moment is still to evaluate Alan's approach to this - @amn41 do we have any updates on that?

@cyrilthank

Thank you @akelad for your reply

Can you please advise the mail id i can send the partnership credentials/documentation to confirm that we do have a rasa partnership

my mail id is [email protected]

@akelad
Contributor

akelad commented Oct 11, 2019

Hi @cyrilthank yes my bad, there was some miscommunication about that. Please contact us via email if you want to discuss anything regarding our partnership, as we would prefer not to do this in the community

@cyrilthank

Thank you @akelad for patiently checking this. I understand completely 'multiple entities' causing......

Appreciate it if you could drop me a mail at [email protected] on how i may contact rasa-partnership team by email

Thank you for all your help

cyril

@JulianGerhard21
Contributor

JulianGerhard21 commented Oct 15, 2019

Hi @cyrilthank, @akelad, @amn41,

actually I am interested in solving this too. If additional resources are needed, you might want to count me/us in, in case you don't want to do it internally.

Regards
Julian

@cyrilthank

cyrilthank commented Oct 16, 2019

Hi @akelad please share the partnership related engagement information to
since we want to explore other use-cases too where we can collaborate to best leverage the partnership agreement for both our organizations

@akelad
Contributor

akelad commented Oct 17, 2019

@cyrilthank you can get in touch with us via: [email protected]

@amn41 what's the update with your PR? Would be good to see if it makes sense for a community member to work on this :)

@amn41
Contributor

amn41 commented Oct 17, 2019

sorry - I have an idea for a simpler way to achieve this but need to work out the details. Will discuss with @tabergma @Ghostvv

@cyrilthank

Thanks @akelad i have reached out to [email protected]

@nbeuchat
Contributor

Hi team, is it something you are still working on? Our organization really needs this and I saw that the other PR #3889 was closed.

@amn41
Contributor

amn41 commented Nov 25, 2019

yes, we are! waiting on a couple of other NLU pieces to come together but working on a good solution

@cyrilthank

cyrilthank commented Nov 26, 2019 via email

@nbeuchat
Contributor

> yes, we are! waiting on a couple of other NLU pieces to come together but working on a good solution

That's awesome :-) Looking forward to it! In the meantime, we'll use @BeWe11's solution as a workaround

@tabergma
Contributor

tabergma commented Apr 6, 2020

@nbeuchat @BeWe11 We have a first working version of the composite entity feature ready. If you are still interested, I can share the feature with you so that you can test it before the actual release. Just let me know. Thanks.

@nbeuchat
Contributor

nbeuchat commented Apr 6, 2020

Thanks for the update @tabergma ! I'd love to check it out :-)

@BeWe11
Author

BeWe11 commented Apr 7, 2020

I am interested as well, @tabergma !

@shubhamnatraj

@tabergma I am struggling with a similar use case too! If possible I would love to try using it and test it out

@tabergma
Contributor

@shubhamnatraj We actually released the feature yesterday, see here. Would be great if you could share some feedback with us once you tested it. Thanks.

@shubhamnatraj

@tabergma Thank you! I will try it out and share my feedback on the thread itself.

@akelad
Contributor

akelad commented Apr 30, 2020

I would close this issue - feel free to still leave any feedback you have here or on the forum. And if there are any issues or enhancement requests with the current feature, please open a new issue.

@akelad akelad closed this as completed Apr 30, 2020