Commit 21f362a

[DOCS] Add a lowercase email example to keyword tokenizer docs (#53257)
1 parent 374e76d commit 21f362a

1 file changed: +48 -0 lines changed

docs/reference/analysis/tokenizers/keyword-tokenizer.asciidoc

@@ -44,6 +44,54 @@ The above sentence would produce the following term:
 [ New York ]
 ---------------------------
 
+[discrete]
+[[analysis-keyword-tokenizer-token-filters]]
+=== Combine with token filters
+You can combine the `keyword` tokenizer with token filters to normalise
+structured data, such as product IDs or email addresses.
+
+For example, the following <<indices-analyze,analyze API>> request uses the
+`keyword` tokenizer and <<analysis-lowercase-tokenfilter,`lowercase`>> filter to
+convert an email address to lowercase.
+
+[source,console]
+---------------------------
+POST _analyze
+{
+  "tokenizer": "keyword",
+  "filter": [ "lowercase" ],
+
+}
+---------------------------
+
+/////////////////////
+
+[source,console-result]
+----------------------------
+{
+  "tokens": [
+    {
+      "token": "[email protected]",
+      "start_offset": 0,
+      "end_offset": 22,
+      "type": "word",
+      "position": 0
+    }
+  ]
+}
+----------------------------
+
+/////////////////////
+
+
+The request produces the following token:
+
+[source,text]
+---------------------------
+
+---------------------------
+
+
 [float]
 === Configuration
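
As a rough illustration of the behaviour the added section describes, the following Python sketch mimics a `keyword` tokenizer followed by a `lowercase` token filter. It is not Elasticsearch code: the function names (`keyword_tokenize`, `lowercase_filter`, `analyze`) and the sample address are hypothetical, and the output only loosely mirrors the shape of an analyze API response.

```python
def keyword_tokenize(text):
    """Emit the entire input as a single token (hypothetical helper).

    Offsets are counted in Python characters, which is only an
    approximation of how Elasticsearch reports offsets for non-ASCII text.
    """
    return [{
        "token": text,
        "start_offset": 0,
        "end_offset": len(text),
        "type": "word",
        "position": 0,
    }]


def lowercase_filter(tokens):
    """Lowercase each token's text and leave offsets and position unchanged."""
    return [{**tok, "token": tok["token"].lower()} for tok in tokens]


def analyze(text):
    """Chain the tokenizer and the filter, returning a response-like dict."""
    return {"tokens": lowercase_filter(keyword_tokenize(text))}


if __name__ == "__main__":
    # Hypothetical sample input, not the address used in the commit.
    print(analyze("Jane.DOE@Example.ORG"))
```

Because the tokenizer never splits the input, the filter sees one token covering the whole string, which is why this combination suits identifiers such as email addresses or product IDs.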
