Markers docs #10095

Merged
merged 37 commits into from
Nov 15, 2021
Changes from 8 commits
Commits
a3da356
Marker docs
aeshky Nov 4, 2021
c0f5023
removed text that is left over from comments.
aeshky Nov 4, 2021
f190d87
small edits.
aeshky Nov 4, 2021
239b19b
updated docs.
aeshky Nov 5, 2021
e9418d4
fixing yaml spaces - export error
aeshky Nov 5, 2021
ec9ff43
yaml fixes
aeshky Nov 5, 2021
a2c91e0
changed formatting for notes and added a heading and an experimental …
aeshky Nov 5, 2021
a8f0965
removed the word "note"
aeshky Nov 5, 2021
f5a005a
removed duplicated CLI usage instructions.
aeshky Nov 10, 2021
6546961
fixed a few typos.
aeshky Nov 10, 2021
69a93ba
Rephrased the "Defining Markers" section to follow our doc's style gu…
aeshky Nov 11, 2021
6058973
Rephrased and fixed typos
aeshky Nov 11, 2021
1778727
Merge branch 'main' into marker_docs
aeshky Nov 11, 2021
3eb574a
Fixed the examples
aeshky Nov 11, 2021
b40ceb8
rephrasing the section Extracting markers, following the docs style g…
aeshky Nov 11, 2021
188a0dd
clarified that you need a tracker store to run markers.
aeshky Nov 11, 2021
176673b
clarified why there are `nan`s
aeshky Nov 11, 2021
ee45998
rewrote the introduction.
aeshky Nov 11, 2021
19c4230
Merge branch 'main' into marker_docs
aeshky Nov 11, 2021
475ee70
reformatted yaml.
aeshky Nov 12, 2021
ef6c80c
Merge branch 'marker_docs' of https://github.com/RasaHQ/rasa into mar…
aeshky Nov 12, 2021
c9d11ce
additional rephrasing and moving some sentences around.
aeshky Nov 12, 2021
9a89d99
Added changelog.
aeshky Nov 12, 2021
49f365e
Added `markers` to the sidebar
aeshky Nov 12, 2021
b66f5ee
Merge branch 'main' into marker_docs
aeshky Nov 12, 2021
4072436
Fixing typos, rephrasing, and adding clarification sentences.
aeshky Nov 12, 2021
d0ae4e5
Merge branch 'marker_docs' of https://github.com/RasaHQ/rasa into mar…
aeshky Nov 12, 2021
36ce3b0
updated the examples with a more recent run on markers on moodbot.
aeshky Nov 12, 2021
1fba5ef
Merge branch 'main' into marker_docs
aeshky Nov 12, 2021
235fddb
clarifying `nans`
aeshky Nov 12, 2021
3abbd97
Merge branch 'marker_docs' of https://github.com/RasaHQ/rasa into mar…
aeshky Nov 12, 2021
03913bb
placed markers under "evaluation" side bar
aeshky Nov 12, 2021
cc055e4
proof reading edits.
aeshky Nov 12, 2021
9c9851b
Merge branch 'main' into marker_docs
aeshky Nov 12, 2021
08a6726
Merge branch 'main' into marker_docs
aeshky Nov 12, 2021
0cab394
Merge branch 'main' into marker_docs
aeshky Nov 15, 2021
dd7d0c3
Merge branch 'main' into marker_docs
aeshky Nov 15, 2021
291 changes: 291 additions & 0 deletions docs/docs/markers.mdx
@@ -0,0 +1,291 @@
---
id: markers
sidebar_label: Markers
title: Markers
description: Read more about how to mark points of interest in dialogues using logical expressions.
abstract: Markers are logical expressions that allow you to describe points in the dialogue that you are interested in.
---

:::caution

This feature is currently experimental and might change or be removed in the future. Share your feedback in the forum to help us make it production-ready.

:::

Markers allow you to describe points in a dialogue that you are interested in identifying. In Rasa, a dialogue is represented as a sequence of events, which include bot actions that were executed (`ActionExecuted`), intents that were detected (`UserUttered`), and slots that were set (`SlotSet`). A point of interest in a dialogue can thus be expressed as a logical expression that can be evaluated against the sequence of dialogue events. When this logical expression is met, we "mark" that point in the dialogue for further analysis or inspection.

There are several applications for Markers. For example, they can be used to define your bot's Key Performance Indicators (KPIs), such as dialogue completion or task success. For [Carbon Bot](https://rasa.com/blog/using-conversation-tags-to-measure-carbon-bots-success-rate/) (which helps users offset their carbon emissions from flying) dialogue completion can be defined as "all mandatory slots have been filled", while task success can be defined as "all mandatory slots have been filled and a carbon estimate has been successfully computed". Marking these events allows us to evaluate Carbon Bot's performance by quantifying how often it succeeds and how often it fails.

Markers also allow you to diagnose your dialogues by surfacing important events for further inspection. For example, we might observe that Carbon Bot tends to successfully set the `travel_departure` and `travel_destination` slots, but not the `travel_flight_class`. We can define a marker to quantify how often this scenario occurs and surface relevant dialogues for review as part of [Conversation Driven Development (CDD)](https://rasa.com/blog/conversation-driven-development-a-better-approach-to-building-ai-assistants/).

Markers are defined using a simple `yaml` syntax. For example, here are markers that define dialogue completion and task success for Carbon Bot:

```yaml
marker_dialogue_completion:
  and:
    - slot_was_set: travel_departure
    - slot_was_set: travel_destination
    - slot_was_set: travel_flight_class
marker_task_success:
  and:
    - slot_was_set: travel_departure
    - slot_was_set: travel_destination
    - slot_was_set: travel_flight_class
    - action: provide_carbon_estimate
```

And here is a marker for surfacing dialogues where all mandatory slots are set except `travel_flight_class`:

```yaml
marker_dialogue_completion:
  and:
    - slot_was_set: travel_departure
    - slot_was_set: travel_destination
    - not:
        slot_was_set: travel_flight_class
```

## Defining Markers

To describe the events that we're interested in, we provide `conditions` and `operators`.

Conditions are simple expressions that describe an event and are used like variables in a logical expression, for example `action: utter_greet`. Operators allow you to combine conditions or sub-marker definitions, which can themselves include nested operators and conditions.

### Conditions

We support the following conditions:

- `action`: the specified bot action was executed.
- `intent`: the specified user intent was detected.
- `slot_was_set`: the specified slot was set.

We also support their negated forms:

- `not_action`: the event is not an `ActionExecuted` with the specified action.
- `not_intent`: the event is not a `UserUttered` with the specified intent.
- `slot_was_not_set`: the specified slot has not been set.

### Operators

To combine conditions, we provide the following operators:

- `and`: all sub-conditions applied.
- `or`: any of the sub-conditions applied.
- `not`: the sub-condition did not apply. `not` can have only 1 condition.
- `seq`: the list of sub-conditions applied in the specified order with any number of events occurring in-between.
- `at_least_once`: the listed sub-marker definitions occurred at least once. Only the first occurrence will be marked.
- `never`: the listed sub-marker definitions never occurred.
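
To make the operator semantics more concrete, here is a minimal Python sketch of how conditions and some of the operators could be evaluated against a list of events. It is not Rasa's implementation. The simplified event dictionaries, the function names `condition_applies`, `seq_applies`, and `marker_applies`, and the assumption that a slot counts as set from the event that sets it onward are all hypothetical. Only `and`, `or`, `not`, and `seq` are covered.

```python
from typing import Any, Dict, List

# Hypothetical, simplified events: {"type": "user" | "action" | "slot", "name": ...}
Event = Dict[str, Any]
Marker = Dict[str, Any]  # exactly one key: a condition tag or an operator tag


def condition_applies(tag: str, value: str, events: List[Event], i: int) -> bool:
    """Check a single condition at event index i."""
    event = events[i]
    if tag == "action":
        return event["type"] == "action" and event["name"] == value
    if tag == "intent":
        return event["type"] == "user" and event["name"] == value
    if tag == "slot_was_set":
        # Assumption: a slot counts as set from the event that sets it onward.
        return any(e["type"] == "slot" and e["name"] == value for e in events[: i + 1])
    raise ValueError(f"unsupported condition: {tag}")


def seq_applies(subs: List[Marker], events: List[Event], i: int) -> bool:
    """The last sub-marker applies at i; the earlier ones applied before it, in order."""
    if not marker_applies(subs[-1], events, i):
        return False
    pos = 0
    for sub in subs[:-1]:
        # Greedily find the earliest event before i where this sub-marker applies.
        while pos < i and not marker_applies(sub, events, pos):
            pos += 1
        if pos >= i:
            return False
        pos += 1
    return True


def marker_applies(marker: Marker, events: List[Event], i: int) -> bool:
    """Evaluate a (possibly nested) marker definition at event index i."""
    (tag, value), = marker.items()
    if tag == "and":
        return all(marker_applies(sub, events, i) for sub in value)
    if tag == "or":
        return any(marker_applies(sub, events, i) for sub in value)
    if tag == "not":
        sub = value[0] if isinstance(value, list) else value
        return not marker_applies(sub, events, i)
    if tag == "seq":
        return seq_applies(value, events, i)
    return condition_applies(tag, value, events, i)


events = [
    {"type": "user", "name": "greet"},
    {"type": "action", "name": "utter_greet"},
    {"type": "user", "name": "mood_unhappy"},
    {"type": "action", "name": "utter_cheer_up"},
]
cheer_up = {"seq": [{"intent": "mood_unhappy"}, {"action": "utter_cheer_up"}]}
marked = [i for i in range(len(events)) if marker_applies(cheer_up, events, i)]
print(marked)  # [3]
```

In this sketch a marker is evaluated once per event index, so "marking" a dialogue amounts to collecting the indices at which the expression holds.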

### Marker Configuration

To define one or more markers, you need to specify a `yaml` configuration file. Here is an example:

```yaml
marker_name_provided:
  slot_was_set: name
marker_mood_expressed:
  or:
    - intent: mood_unhappy
    - intent: mood_great
marker_cheer_up:
  seq:
    - intent: mood_unhappy
    - action: utter_cheer_up
marker_bot_challenged:
  at_least_once:
    - intent: bot_challenge
marker_mood_expressed_and_name_provided:
  and:
    - slot_was_set: name
    - or:
        - intent: mood_unhappy
        - intent: mood_great
```

The top level contains unique user-specified names for each marker. This is a dictionary mapping marker names to marker definitions.

Each marker definition can contain an arbitrary combination of `conditions` and `operators` that map to a valid sub-marker definition.

Each `condition` maps a condition tag to a value. For example, in `marker_name_provided`, the tag `slot_was_set` is mapped to the slot `name` which exists in the `domain.yaml` file.

Each `operator` is a dictionary mapping an operator tag (e.g., `and`) to a list of sub markers, which can be conditions or a nested operator-conditions definition (e.g., see `marker_mood_expressed_and_name_provided`).

:::note
You cannot yet reuse an existing marker name in the definition of another marker.

:::
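
The structural rules described above can be checked mechanically. The following Python sketch validates an already-parsed configuration (the dictionary a YAML parser would return). The function names `validate_definition` and `validate_marker_config` are hypothetical, the tag sets are copied from the lists in this document, and accepting a single nested mapping under an operator (as in the `not` example in the introduction) is an assumption. It is not the validation Rasa performs.

```python
from typing import Any, Dict, List

CONDITIONS = {"action", "intent", "slot_was_set",
              "not_action", "not_intent", "slot_was_not_set"}
OPERATORS = {"and", "or", "not", "seq", "at_least_once", "never"}


def validate_definition(definition: Any, path: str) -> List[str]:
    """Return a list of problems found in one (sub-)marker definition."""
    if not isinstance(definition, dict) or len(definition) != 1:
        return [f"{path}: expected a mapping with exactly one tag"]
    (tag, value), = definition.items()
    if tag in CONDITIONS:
        if isinstance(value, str):
            return []
        return [f"{path}.{tag}: expected a name, got {type(value).__name__}"]
    if tag in OPERATORS:
        # Assumption: a single nested mapping is treated like a one-element list.
        subs = [value] if isinstance(value, dict) else value
        if not isinstance(subs, list) or not subs:
            return [f"{path}.{tag}: expected a non-empty list of sub-markers"]
        problems = []
        if tag == "not" and len(subs) != 1:
            problems.append(f"{path}.not: expected exactly one sub-marker")
        for idx, sub in enumerate(subs):
            problems += validate_definition(sub, f"{path}.{tag}[{idx}]")
        return problems
    return [f"{path}: unknown tag '{tag}'"]


def validate_marker_config(config: Dict[str, Any]) -> List[str]:
    """Validate every top-level marker; an empty result means no problems were found."""
    problems = []
    for name, definition in config.items():
        problems += validate_definition(definition, name)
    return problems


config = {
    "marker_name_provided": {"slot_was_set": "name"},
    "marker_mood_expressed_and_name_provided": {
        "and": [
            {"slot_was_set": "name"},
            {"or": [{"intent": "mood_unhappy"}, {"intent": "mood_great"}]},
        ]
    },
    # Deliberately broken: "intents" is not a known tag.
    "marker_broken": {"and": {"intents": "greet"}},
}
for problem in validate_marker_config(config):
    print(problem)
```

Because the top level is a dictionary, marker names are unique by construction once the file has been parsed; a duplicate key in the YAML source would need to be caught by the parser itself.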

## Running the Markers script

Markers are extracted from the dialogues already stored in your [tracker store](https://rasa.com/docs/rasa/next/tracker-stores). Once you've created your marker definitions in the markers configuration file, you can extract them by running the following command:

```bash
rasa evaluate markers --config markers.yaml extracted_markers.csv all
```

The argument `all` will cause the script to process all the trackers in your tracker store. To process a subset you can use a different tracker selection strategy, either `first_n` or `sample`. Refer to the CLI usage for details.
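
As an illustration of what the three selection strategies mean, here is a small Python sketch over a list of sender IDs. The function `select_trackers` and its `count` and `seed` parameters are hypothetical and only mirror the descriptions "select trackers sequentially until N are taken" and "select trackers by sampling N". The actual options of the command are listed in the CLI usage.

```python
import random
from typing import List, Optional


def select_trackers(
    sender_ids: List[str],
    strategy: str,
    count: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[str]:
    """Pick the trackers to process according to a selection strategy."""
    if strategy == "all":
        return list(sender_ids)
    if count is None or count < 1:
        raise ValueError("'first_n' and 'sample' need a positive count")
    if strategy == "first_n":
        # Take trackers in storage order until `count` are taken.
        return sender_ids[:count]
    if strategy == "sample":
        # Draw `count` trackers at random without replacement.
        rng = random.Random(seed)
        return rng.sample(sender_ids, min(count, len(sender_ids)))
    raise ValueError(f"unknown strategy: {strategy}")


sender_ids = ["a", "b", "c", "d"]
print(select_trackers(sender_ids, "first_n", count=2))  # ['a', 'b']
print(len(select_trackers(sender_ids, "sample", count=2, seed=0)))  # 2
```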

Each tracker in the tracker store is identified by a `sender_id` and can contain multiple sessions. When extracting markers and computing their statistics, each session is considered separately.


### Extracted Markers

We extract information about top-level markers which have user-specified names, and extract meta information about all the events where the markers applied.

For the `at_least_once` marker we only extract meta information about the first occurrence.

We extract the following meta information:

1. the number of user turns (i.e., the number of `UserUttered` events) preceding the event at which a marker applied
2. the index of the event at which a marker applied
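
The two pieces of meta information can be sketched in a few lines of Python. The event dictionaries and the function name `marker_meta_data` are hypothetical; the sketch assumes that "preceding" excludes the event at which the marker applied.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, simplified events: {"type": "user" | "action" | "slot", "name": ...}
Event = Dict[str, Any]


def marker_meta_data(events: List[Event], event_idx: int) -> Tuple[int, int]:
    """Return (event index, number of user turns before that event)."""
    preceding_user_turns = sum(
        1 for event in events[:event_idx] if event["type"] == "user"
    )
    return event_idx, preceding_user_turns


events = [
    {"type": "user", "name": "greet"},
    {"type": "action", "name": "utter_greet"},
    {"type": "user", "name": "mood_great"},
    {"type": "action", "name": "utter_happy"},
]
# A marker on `intent: mood_great` would apply at index 2, after one user turn.
print(marker_meta_data(events, 2))  # (2, 1)
```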

We output the extracted information in a `.csv` file that contains the following columns:

- sender_id
- session_idx
- marker_name
- event_idx (where we index all events in the tracker, starting with index 0)
- num_preceding_user_turns

Here is a sample output:

```
sender_id,session_idx,marker_name,event_id,num_preceding_user_turns
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,7,1
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,7,1
```

extracted_markers.csv

Review comments on the header line of this sample:

@aeshky (Contributor Author), Nov 11, 2021: @usc-m and @ka-bu I noticed that we use marker_name in the extracted marker output, but use marker in the stats files. I would like to make it consistent if that's okay, switching to marker for both.

Not sure if I should do it in this PR or in another one (possibly combining it with any CLI changes). What do you think?

Contributor: Sounds good - I'd split it out into its own PR to not clutter this one too much (or combine with the CLI changes)
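
For downstream analysis, the extracted markers file can be read with Python's standard `csv` module. The sketch below parses the sample output shown in this section and counts, per marker, the number of distinct sessions in which it applied. The column names are taken from that sample and may differ in other versions; the variable names are illustrative.

```python
import csv
import io
from collections import Counter

# Sample copied from this section; in practice, open "extracted_markers.csv" instead.
sample = """sender_id,session_idx,marker_name,event_id,num_preceding_user_turns
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,7,1
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,7,1
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Count each (marker, sender, session) combination once, even if the marker
# applied at several events within the same session.
seen = set()
sessions_per_marker = Counter()
for row in rows:
    key = (row["marker_name"], row["sender_id"], row["session_idx"])
    if key not in seen:
        seen.add(key)
        sessions_per_marker[row["marker_name"]] += 1

print(sessions_per_marker["marker_mood_expressed"])  # 2
```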

### Computed statistics

By default, the command above also computes summary statistics about the meta information gathered. To disable the statistics computation, use the optional flag `--no-stats`.

For each session and each marker, we are interested in the number of user turns which preceded the event at which the marker applied.

Thus we compute the following:

1. **For each session and each marker** we compute "per-session statistics" which include the arithmetic mean, median, minimum, and maximum number of user turns preceding the event at which the marker applied.
2. **For all sessions and for each marker** we compute:
   1. overall statistics including the arithmetic mean, median, minimum, and maximum number of user turns preceding the event where the respective marker applied in any session
   2. the number of sessions and the percentage of sessions where each marker applied at least once
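
The statistics listed above can be sketched with Python's standard `statistics` module. The input dictionary `per_session_turns`, the function name `summarize`, and the choice to report `nan` for sessions in which the marker never applied are assumptions for illustration; this is not the code Rasa runs.

```python
import math
import statistics
from typing import Dict, List, Tuple


def summarize(values: List[int]) -> Dict[str, float]:
    """Count, mean, median, min, and max; nan where there is nothing to summarize."""
    if not values:
        nan = float("nan")
        return {"count": 0, "mean": nan, "median": nan, "min": nan, "max": nan}
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


# (sender_id, session_idx) -> numbers of preceding user turns for one marker
per_session_turns: Dict[Tuple[str, int], List[int]] = {
    ("sender_a", 0): [1],
    ("sender_b", 0): [1],
    ("sender_c", 0): [],  # the marker never applied in this session
}

# 1. Per-session statistics
per_session = {key: summarize(turns) for key, turns in per_session_turns.items()}

# 2.1 Overall statistics across all sessions
overall = summarize([t for turns in per_session_turns.values() for t in turns])

# 2.2 Number and percentage of sessions where the marker applied at least once
applied = sum(1 for turns in per_session_turns.values() if turns)
percentage = round(100 * applied / len(per_session_turns), 3)

print(applied, percentage)  # 2 66.667
```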

The results are stored in tabular format in two files, `stats-overall.csv` and `stats-per-session.csv`. You can change the prefix `stats` on the command line with the `--stats-file-prefix` option.

Here is a sample output:

```
sender_id,session_idx,marker,statistic,value
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,count(number of preceding user turns),1
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,count(number of preceding user turns),1
79e7604d2acc442d9f78e00b173786c9,0,marker_mood_expressed,count(number of preceding user turns),0
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,max(number of preceding user turns),1
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,max(number of preceding user turns),1
79e7604d2acc442d9f78e00b173786c9,0,marker_mood_expressed,max(number of preceding user turns),nan
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,mean(number of preceding user turns),1.0
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,mean(number of preceding user turns),1.0
79e7604d2acc442d9f78e00b173786c9,0,marker_mood_expressed,mean(number of preceding user turns),nan
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,median(number of preceding user turns),1.0
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,median(number of preceding user turns),1.0
79e7604d2acc442d9f78e00b173786c9,0,marker_mood_expressed,median(number of preceding user turns),nan
1309ed25d6fa45beb74d0862d37289a5,0,marker_mood_expressed,min(number of preceding user turns),1
2cf4106e76a041dfb2d4f75bd6ae804b,0,marker_mood_expressed,min(number of preceding user turns),1
79e7604d2acc442d9f78e00b173786c9,0,marker_mood_expressed,min(number of preceding user turns),nan
:
```

stats-per-session.csv

```
sender_id,session_idx,marker,statistic,value
all,nan,-,total_number_of_sessions,3
all,nan,marker_cheer_up,number_of_sessions_where_marker_applies_at_least_once,0
all,nan,marker_cheer_up,percentage_of_sessions_where_marker_applies_at_least_once,0.0
all,nan,marker_mood_expressed,number_of_sessions_where_marker_applies_at_least_once,2
all,nan,marker_mood_expressed,percentage_of_sessions_where_marker_applies_at_least_once,66.667
all,nan,marker_mood_expressed_and_name_provided,number_of_sessions_where_marker_applies_at_least_once,0
all,nan,marker_mood_expressed_and_name_provided,percentage_of_sessions_where_marker_applies_at_least_once,0.0
:
```

stats-overall.csv

Review comments on this sample:

Contributor: Do we really expect a nan as a session_id in the csv-file? If they're all nan, I'm wondering why the column wasn't omitted. Or is this happening only in a specific setting?

Contributor: there is a nan because trackers is "all" and session_idxs are ints so we shouldn't use "all" to indicate that this is over all sessions easily -- we could move those rows to another file to get rid of the column entirely, use "all" and accept that this column will be loaded as "str" in pandas, ... or use a different value in that column ?

Contributor Author: I think I just need to explain what the nans mean here. I like having the same column names for both statistics output files. 🤔

Contributor: Is the value column an int or a float? Seeing two different schemas here is a bit scary.

Contributor Author: It will depend on the statistic calculated. For example, "the number of" was written as an int while "percentage" was written as a float. If you read it as a pandas dataframe they will all be treated as floats. I think it's fine as long as the column only contains numbers. What do you think?

Contributor: I worry about the pandas assumption. Yes, pandas will deal with it in an expected way. But what about excel? Spark? Dask? Polars?

Each library typically has its own little quirk when it comes to casting assumptions and anything that we can proactively do to prevent assumptions from taking over may prevent a small disaster. There's a (pretty darn good) PyData Amsterdam talk titled "High Performance Data Loss" on the topic.

Contributor Author: We can create an issue for this and include this in the upcoming marker fixes, as it's more to do with the code than the docs. Thanks for raising it.
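
Since the `value` column mixes integer and floating-point values and `session_idx` is `nan` for rows that aggregate over all sessions, it can help to parse the statistics files with explicit types rather than relying on a library's type inference. The sketch below uses only the standard library; the function name `parse_row` and the choice to map a `nan` session index to `None` are assumptions for illustration.

```python
import csv
import io
from typing import Any, Dict

# A few rows copied from the sample; in practice, open "stats-overall.csv" instead.
sample = """sender_id,session_idx,marker,statistic,value
all,nan,-,total_number_of_sessions,3
all,nan,marker_mood_expressed,number_of_sessions_where_marker_applies_at_least_once,2
all,nan,marker_mood_expressed,percentage_of_sessions_where_marker_applies_at_least_once,66.667
"""


def parse_row(row: Dict[str, str]) -> Dict[str, Any]:
    """Convert one CSV row to explicit Python types."""
    # Rows that summarize all sessions carry no session index.
    session_idx = None if row["session_idx"] == "nan" else int(row["session_idx"])
    return {
        "sender_id": row["sender_id"],
        "session_idx": session_idx,
        "marker": row["marker"],
        "statistic": row["statistic"],
        # Always read values as float so ints, floats, and "nan" share one type.
        "value": float(row["value"]),
    }


rows = [parse_row(row) for row in csv.DictReader(io.StringIO(sample))]
for row in rows:
    print(row["marker"], row["statistic"], row["value"])
```

The same function works for `stats-per-session.csv`, where `session_idx` is an integer and `value` may be `nan` for sessions in which a marker never applied.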

## Configuring the CLI command

You can configure the marker extraction and statistics computation using the following arguments:

```
usage: rasa evaluate markers [-h] [-v] [-vv] [--quiet] [--config CONFIG]
                             [--no-stats | --stats-file-prefix [STATS_FILE_PREFIX]]
                             [--endpoints ENDPOINTS] [-d DOMAIN]
                             output_filename {first_n,sample,all} ...

positional arguments:
  output_filename       The filename to write the extracted markers to (CSV format).
  {first_n,sample,all}
    first_n             Select trackers sequentially until N are taken.
    sample              Select trackers by sampling N.
    all                 Select all trackers.

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       The config file(s) containing marker definitions. This can be a single YAML
                        file, or a directory that contains several files with marker definitions in
                        it. The content of these files will be read and merged together. (default:
                        markers.yml)
  --no-stats            Do not compute summary statistics. (default: True)
  --stats-file-prefix [STATS_FILE_PREFIX]
                        The common file prefix of the files where we write out the compute
                        statistics. More precisely, the file prefix must consist of a common path
                        plus a common file prefix, to which suffixes `-overall.csv` and `-per-
                        session.csv` will be added automatically. (default: stats)
  --endpoints ENDPOINTS
                        Configuration file for the tracker store as a yml file. (default:
                        endpoints.yml)
  -d DOMAIN, --domain DOMAIN
                        Domain specification. This can be a single YAML file, or a directory that
                        contains several files with domain specifications in it. The content of
                        these files will be read and merged together. (default: domain.yml)

Python Logging Options:
  -v, --verbose         Be verbose. Sets logging level to INFO. (default: None)
  -vv, --debug          Print lots of debugging statements. Sets logging level to DEBUG. (default:
                        None)
  --quiet               Be quiet! Sets logging level to WARNING. (default: None)
```

Review comments on the `--config` option:

Contributor: I'm wondering if --config is the best name here since it may cause confusion with config.yaml. Might it be possible to rename this to --markers?

Contributor Author: We're considering this under this issue. I think it's fine to have different 'configurations' for different things. I worry we'll end up using 'markers' to mean both the marker config and the marker output 😄
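
The help text for `--stats-file-prefix` says that the suffixes `-overall.csv` and `-per-session.csv` are appended to the given prefix. As a small sketch of how that naming could be reproduced in a script that post-processes the output (the function name `stats_file_paths` is hypothetical, not part of Rasa):

```python
from pathlib import Path
from typing import Tuple


def stats_file_paths(prefix: str = "stats") -> Tuple[Path, Path]:
    """Build the two statistics file paths from a common path-plus-prefix."""
    overall = Path(f"{prefix}-overall.csv")
    per_session = Path(f"{prefix}-per-session.csv")
    return overall, per_session


overall, per_session = stats_file_paths("results/run1")
print(overall.name)      # run1-overall.csv
print(per_session.name)  # run1-per-session.csv
```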