Describe the bug
Imagine this scenario:
We need to capture 10-digit mobile numbers uttered by the user from transcripts using ListEntityPlugin.
"alternatives": [
  [
    {
      "confidence": 0.9299549,
      "transcript": "my number is 9041270333"
    },
    {
      "confidence": 0.9299549,
      "transcript": "my claim is 1041270333"
    },
    {
      "confidence": 0.9299549,
      "transcript": "my claim is 1041270333"
    }
  ]
],
Passing the following pattern inside ListEntityPlugin:
We capture the following (entity type, value) group:
{('number_pattern', 'mobile'): [KeywordEntity(body='9041270333', type='number_pattern', parsers=['ListEntityPlugin'], score=0, alternative_index=0, alternative_indices=None, latent=False, value='mobile', entity_type='number_pattern', _meta={})]}
Because of this, we miss the other number, 1041270333, which actually occurs more often across the alternatives.
In order to include this value, we can replace the above group-by logic with this:
Disclaimer:
The group-by aggregation suggested above will not work with Datetime entities. Can someone suggest a better alternative to handle this particular edge case?
keshav47 changed the title from "Petition to refactor, group_by(entity.type, entity.get_value()) with group_by(entity.body, entity.get_value()) inside entity_consensus functionality." to "Treat entities with same type and value but dissimilar bodies as different entities?" on Mar 29, 2022
Describe the bug
Imagine this scenario:
We need to capture 10-digit mobile numbers uttered by the user from transcripts using ListEntityPlugin.
Passing the following pattern inside ListEntityPlugin:
Applying the current logic inside the entity_consensus function, i.e. grouping by (entity.type, entity.get_value()), we capture the following (entity type, value) group:
{('number_pattern', 'mobile'): [KeywordEntity(body='9041270333', type='number_pattern', parsers=['ListEntityPlugin'], score=0, alternative_index=0, alternative_indices=None, latent=False, value='mobile', entity_type='number_pattern', _meta={})]}
Because of this, we miss the other number, 1041270333, which actually occurs more often across the alternatives.
In order to include this value, we can replace the above group-by logic with this:
This will capture the other entity as well:
{('9041270333', 'mobile'): [KeywordEntity(body='9041270333', type='number_pattern', parsers=['ListEntityPlugin'], score=0, alternative_index=0, alternative_indices=None, latent=False, value='mobile', entity_type='number_pattern', _meta={})], ('1041270333', 'mobile'): [KeywordEntity(body='1041270333', type='number_pattern', parsers=['ListEntityPlugin'], score=0, alternative_index=0, alternative_indices=None, latent=False, value='mobile', entity_type='number_pattern', _meta={})]}
Disclaimer:
The group-by aggregation suggested above will not work with Datetime entities. Can someone suggest a better alternative to handle this particular edge case?
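To make the difference between the two grouping keys concrete, here is a minimal, self-contained sketch. It does not use the library's real classes: `Entity` and `group_by` are hypothetical stand-ins written for illustration, and the three sample entities are assumed to be what ListEntityPlugin would extract from the three transcripts in this issue. Only the two keys, `(entity.type, entity.get_value())` and `(entity.body, entity.get_value())`, are taken from the issue itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for an extracted entity (not the library's class)."""
    body: str
    type: str
    value: str
    alternative_index: int

    def get_value(self) -> str:
        return self.value


def group_by(entities, key):
    """Collect entities into lists keyed by key(entity), keeping input order."""
    groups = defaultdict(list)
    for entity in entities:
        groups[key(entity)].append(entity)
    return dict(groups)


# Assumed extraction result: one entity per transcript alternative.
entities = [
    Entity("9041270333", "number_pattern", "mobile", 0),
    Entity("1041270333", "number_pattern", "mobile", 1),
    Entity("1041270333", "number_pattern", "mobile", 2),
]

# Current key: both numbers collapse into a single group.
by_type = group_by(entities, lambda e: (e.type, e.get_value()))

# Proposed key: each distinct body gets its own group.
by_body = group_by(entities, lambda e: (e.body, e.get_value()))

print({key: len(group) for key, group in by_type.items()})
print({key: len(group) for key, group in by_body.items()})
```

With the type-based key, all three entities share one group, so any step that keeps a single representative per group can only surface one of the two numbers. With the body-based key, 1041270333 forms its own group of two, which is what lets it be counted as the more frequent value.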