This is the companion repository to the Web Hack Wednesday series 3 episode on Advanced LUIS with Gigseekr. See the video at https://youtu.be/3eZqSJlbkEI
In the spring of 2017, we worked with Gigseekr on a chat bot application that enables searching for live music events. The chat bot was written using the Microsoft Bot Framework and makes great use of Language Understanding Intelligent Service (LUIS).
In this episode we build on the LUIS and HowHappy.co.uk mashup episode we did in series 2 by looking at what is new with LUIS and some of the more complex scenarios LUIS now supports - all in the context of GigSeekr.
If you want to dive deeper on the work we did with Gigseekr, you can read the full technical case study Gigseekr builds a live music discovery bot using Bot Framework and LUIS.
Gigseekr are a small company based in the UK who describe themselves as follows: "gigseekr is a live music discovery service committed to helping you find tickets and stay up to date with tours & events across the UK".
We worked with Gigseekr on a bot for Skype, Facebook and Cortana which lets users find live music events, artists, venues and other details related to live music. You can read the full case study of the project here: Gigseekr builds a live music discovery bot using Bot Framework and LUIS.
You can see a copy of the LUIS model we used in the show here: [GigSeekr - simple.md](GigSeekr - simple.md).
This is not the actual LUIS model used by GigSeekr; instead, it is a simplified version which we use as an example for scenarios like this.
In the show, we demonstrated the updated LUIS portal which you can get to at luis.ai.
We also talked about the updated and much improved documentation which you can access at luis.ai/help. In particular, the How to use pages are great for explaining LUIS concepts.
Gigseekr facilitate many types of search based around the following key entities:
- Artist
- ArtistType
- EventType
- Genre
- Location
- Venue
- DateTime
We discussed the basics of what an intent is during the series 2 WHW episode. In this episode we looked at how Gigseekr use intents.
The EventSearch intent supports users searching for events using any combination of the supported entities. This intent supports utterances such as:
i'm looking for [ $Genre ] [ $ArtistType ] [ $EventType::Gig ] in [ $Location ] on [ $datetimeV2 ]
i want to see a [ $ArtistType ] [ $EventType::Gig ] [ $Location ] on [ $datetimeV2 ]
whos on at [ $Venue ]
The EntitySearch intent supports users searching for information on specific artists, events or venues (GigSeekr called these 'entities'). This intent supports utterances such as:
when are [ $Artist ] next playing in [ $Location ]
tell me about [ $Artist ]
info on [ $Venue ]
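As a rough illustration of how a bot might consume these two intents, the sketch below routes on the top-scoring intent and groups the recognised entities by type. It assumes a response shaped like the JSON returned by the LUIS v2 endpoint (`topScoringIntent` and `entities` fields). The sample response is hand-written for this example, real responses carry more fields, and the `route` and `group_entities` helpers and the 0.5 threshold are hypothetical choices, not part of the Gigseekr bot.

```python
from collections import defaultdict

# Hand-written sample in the general shape of a LUIS v2 endpoint response.
# Illustrative only; it was not produced by the model in this repository.
SAMPLE_RESPONSE = {
    "query": "when are some band next playing in leeds",
    "topScoringIntent": {"intent": "EntitySearch", "score": 0.93},
    "entities": [
        {"entity": "some band", "type": "Artist"},
        {"entity": "leeds", "type": "Location"},
    ],
}


def group_entities(response):
    """Collect recognised entity values under their entity type."""
    grouped = defaultdict(list)
    for item in response.get("entities", []):
        grouped[item["type"]].append(item["entity"])
    return dict(grouped)


def route(response, threshold=0.5):
    """Return (intent, entities); fall back to 'None' on low confidence."""
    top = response.get("topScoringIntent", {})
    intent = top.get("intent", "None")
    if top.get("score", 0.0) < threshold:
        intent = "None"
    return intent, group_entities(response)


intent, entities = route(SAMPLE_RESPONSE)
print(intent, entities)
```

Grouping by type keeps the handler for EventSearch simple, since that intent accepts any combination of entities and the handler can just check which keys are present.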
We also used a pre-built domain intent around utilities which helps facilitate common utility utterances such as "Cancel", "Start again" and "Help".
Pre-built domains are a new feature in LUIS which encompass pre-built sets of intents and entities that work together for domains or common categories of apps. The pre-built domains have been pre-trained and are ready for use. The intents and entities in a pre-built domain are fully customizable once you've added them to your app - you can train them with utterances from your system so they work for your users. You can use an entire pre-built domain as a starting point for customization, or just borrow a few intents or entities from a domain for your application.
Read more on pre-built entities and pre-built domains as well as the Cortana pre-built app.
A feature is a distinguishing trait or attribute of data that your system observes. You add features to a language model to provide hints about how to recognize input that you want to label or classify. Features help LUIS recognize both intents and entities, but features are not intents or entities themselves. Instead, features might provide examples of related terms, or a pattern to recognize in related terms.
There are two types of feature, both of which are used in the Gigseekr model:
A phrase list is a list of words or phrases that belong to the same class and should be treated similarly.
The maximum length of a phrase list is 5000 items. You may have a maximum of 10 phrase lists per LUIS app.
Gigseekr use phrase lists to help LUIS identify Artist, ArtistType (band, solo, rapper etc), Genre (Rock, Blues etc), Venue and UK Location (cities, towns, villages).
Location was implemented as a phrase list rather than the pre-built 'geography' entities because we found that the pre-built entity does not do a great job of recognising UK locations, especially smaller ones such as towns and villages. This phrase list was populated with the top 1000 UK place names by population.
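One way to prepare such a Location phrase list is sketched below: rank place names by population, normalise and de-duplicate them, and keep the top N while respecting the 5000-item limit mentioned above. The `build_phrase_list` helper is hypothetical, and the sample weights are placeholders used only to show the ordering; they are not real population figures and this is not the script used for the Gigseekr model.

```python
MAX_PHRASE_LIST_ITEMS = 5000  # per-list limit quoted in the text above


def build_phrase_list(places, top_n=1000):
    """Return up to top_n unique, lower-cased names, largest population first.

    places: iterable of (name, population) tuples.
    """
    if top_n > MAX_PHRASE_LIST_ITEMS:
        raise ValueError("top_n exceeds the phrase list size limit")
    ranked = sorted(places, key=lambda place: place[1], reverse=True)
    seen = set()
    phrases = []
    for name, _population in ranked:
        key = name.strip().lower()
        if key and key not in seen:
            seen.add(key)
            phrases.append(key)
        if len(phrases) == top_n:
            break
    return phrases


# Placeholder weights standing in for population counts (not real data).
sample = [("Ambleside", 100), ("Leeds", 300), ("Bristol", 200), ("leeds ", 1)]

# Phrase list values are typically entered as comma-separated text.
print(",".join(build_phrase_list(sample, top_n=3)))
```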
Pattern Features specify a regular expression to help LUIS recognize regular patterns that are frequently used in your application's domain.
A good example in the live music domain is recognition of ticket references. One of the main ticket selling companies is 'WeGotTickets', who issue ticket confirmation numbers which match this format: [t][14 digit number][b1].
A pattern feature using ([t])(\d+)[b][1] as a regex pattern would help LUIS to recognise 'WeGotTickets' confirmation numbers as tickets and would enable things like users getting details of an event that they already have a ticket for. Note that \d+ accepts any number of digits, so this pattern is looser than the 14-digit format described.
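The regex can be tried outside LUIS to see what it accepts. The sketch below uses Python's `re` module to compare the pattern quoted above with a stricter variant that enforces exactly 14 digits and word boundaries. The stricter pattern, the `find_ticket_refs` helper and the sample confirmation number are assumptions made for illustration; they do not come from WeGotTickets or from the Gigseekr model.

```python
import re

# Pattern as quoted in the text: 't', one or more digits, then 'b1'.
SOURCE_PATTERN = re.compile(r"([t])(\d+)[b][1]")

# Assumed stricter variant: exactly 14 digits, bounded by word boundaries.
STRICT_PATTERN = re.compile(r"\bt(\d{14})b1\b")


def find_ticket_refs(text, pattern=STRICT_PATTERN):
    """Return every substring of text that matches the given pattern."""
    return [match.group(0) for match in pattern.finditer(text)]


# Made-up confirmation number in the t + 14 digits + b1 format.
message = "my confirmation is t12345678901234b1, what time are doors?"

print(find_ticket_refs(message))
# The looser source pattern also accepts short digit runs:
print(find_ticket_refs("seat t12b1", SOURCE_PATTERN))
print(find_ticket_refs("seat t12b1", STRICT_PATTERN))
```

The stricter variant reduces false positives on short strings that happen to look like a reference, at the cost of missing references if the digit count ever changes.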
The training and testing capability has improved to support interactive testing, where you can test with specific phrases, and batch testing, where you can import batches of phrases to test the capabilities of the model.
In summary, the LUIS user interface and functionality have improved a lot since our initial episode in 2016. Companies like GigSeekr are using LUIS to create compelling natural language conversational interfaces on bots, mobile apps and web applications.