Skip to content

Conversation

@renatobecker
Copy link
Contributor

@renatobecker renatobecker commented Oct 24, 2020

Proposed changes

The issue happens when running the Concurrent Chats limit feature in a high-performance environment. Due to simultaneous and concurrent processes, the algorithm(Load Balancing) routes to agents more than the acceptable number of ongoing chats they can assist to.

So, in order to avoid this issue, we're implementing a specific process that will handle the Omnichannel queue and will ensure that each process is running sequentially. Given that each department is considered a queue, we're racing all active queues(departments) that are present in the waiting queue to make sure the queue won't get stuck.
Let say all agents from a specific department are busy, we need to allow agents from other departments to assist the chats associated with departments they are part of.

In addition, an important improvement has been added to the code. Rather than reading the room collection to get the number of ongoing chats per agent, the new version is going the read the subscription collection. This is another performance improvement since the subscriptions are ephemeral when it comes to Omnichannel chats.

Issue(s)

How to test or reproduce

  • Set up Waiting Queue Feature and Max. Number of Chats setting
  • Use a load test tool to fire HTTP requests to api/v1/livechat/room
  • When firing more than 50 threads per second, the issue arises

Screenshots

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves a current function)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Hotfix (a major bugfix that has to be merged asap)
  • Documentation Update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works (if applicable)
  • I have added necessary documentation (if applicable)
  • Any dependent changes have been merged and published in downstream modules

Changelog

Further comments

@renatobecker renatobecker added this to the 3.8.0 milestone Oct 24, 2020
@renatobecker renatobecker changed the title [WIP][IMPROVE] Omnichannel concurrent chats limit [WIP][IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit Oct 24, 2020
@renatobecker renatobecker changed the title [WIP][IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit [WIP][IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit feature Oct 24, 2020
@renatobecker renatobecker changed the title [WIP][IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit feature [IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit feature Oct 25, 2020
@renatobecker renatobecker changed the title [IMPROVE][ENTERPRISE] Omnichannel concurrent chats limit feature [FIX][ENTERPRISE] Race condition on Omnichannel queues Oct 26, 2020
@renatobecker renatobecker marked this pull request as ready for review October 26, 2020 20:04
return this.findOne(query);
}

async getDistinctQueuedDepartments() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to not use distinct from mongodb driver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial implementation was using distinct, but since undefined value is ignored when passing any filter(such as { status: 'queued' }) then I opted for an aggregation.
I implemented the distinct method again and now I'm adding undefined as default.
Undefined= public queue(when the inquiry is not associated with any department)

},
nextQueue() {
if (!this.pool.length) {
this.pool = [...this.activeQueues];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you not calling getActiveQueues here instead of having it being called every 5 seconds? I didn't understand the whole process after first look.. 😅

I think making it clear that the pool is being populated only if it is empty might help understanding the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, your suggestion is exactly how it was implemented :D
The point is that I was thinking about a small number of different queues, then the method will be called more often but since we're now running it sequentially, there's no problem calling it this way.

},
async getActiveQueues() {
// undefined = public queue(without department)
return [undefined].concat(await LivechatInquiry.getDistinctQueuedDepartments());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't think on a better approach yet, but having an undefined item is weird 😂

Copy link
Contributor Author

@renatobecker renatobecker Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely 🙃
That's the reason I was using aggregate to fetch the data rather than distinct.

@sampaiodiego sampaiodiego merged commit 3995eec into develop Oct 27, 2020
@sampaiodiego sampaiodiego deleted the omnichannel/concurrent-chats-limit-feature-failing branch October 27, 2020 22:44
@sampaiodiego sampaiodiego mentioned this pull request Nov 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants