[DOCS] Add a lower-casing email address example to Keyword Tokenizer #53257
Pinging @elastic/es-search (:Search/Analysis)
Pinging @elastic/es-docs (>docs)
Changing the spacing to be more consistent with our other examples.
Let's also use example.com as the placeholder domain.
| "filter": ["lowercase"], | |
| "text": "john.SMITH@global-international.COM" | |
| "filter": [ "lowercase" ], | |
| "text": "john.SMITH@example.COM" |
| "token": "john.smith@global-international.com", | |
| "start_offset": 0, | |
| "end_offset": 35, | |
| "token": "john.smith@example.com", | |
| "start_offset": 0, | |
| "end_offset": 22, |
| [ john.smith@global-inetrnational.com ] | |
| [ john.smith@example.com ] |
| The above sentence would produce the following term: | |
| The request produces the following token: |
jrodewig left a comment
Thanks for the PR @jureaky.
The snippet looks good. I left some suggestions to fix typos and add some text to provide context.
I'd like to take another look after you've had a chance to go through those.
@elasticmachine test this please
Looks like GitHub ate my original suggestion. I'd add some lead-in text here to provide context. The original heading contained a typo.
| [float] | |
| === Exmaple output(Lower-casing email address) | |
| [discrete] | |
| [[analysis-keyword-tokenizer-token-filters]] | |
| === Combine with token filters | |
| You can combine the `keyword` tokenizer with token filters to normalise | |
| structured data, such as product IDs or email addresses. | |
| For example, the following <<indices-analyze,analyze API>> request uses | |
| `keyword` tokenizer and <<analysis-lowercase-tokenfilter,`lowercase`>> filter to | |
| convert an email address to lowercase. |
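As a rough illustration of what the suggested text describes, the sketch below emulates in plain Python what the `keyword` tokenizer followed by the `lowercase` filter does to the example address. The helper `keyword_lowercase_analyze` is hypothetical and is not part of Elasticsearch or its clients. It reproduces only the response fields quoted in this review (`token`, `start_offset`, `end_offset`), and it assumes Python's `str.lower()` is a fair stand-in for the real filter on ASCII input such as this.

```python
import json


def keyword_lowercase_analyze(text: str) -> dict:
    """Emulate the `keyword` tokenizer plus the `lowercase` filter.

    Hypothetical helper for illustration only; not Elasticsearch code.
    """
    # The keyword tokenizer emits the whole input as one token,
    # so the offsets span the entire string.
    token = {
        "token": text.lower(),  # stand-in for the lowercase token filter
        "start_offset": 0,
        "end_offset": len(text),
    }
    return {"tokens": [token]}


# Body of the analyze API request described in the suggested text.
request_body = {
    "tokenizer": "keyword",
    "filter": ["lowercase"],
    "text": "john.SMITH@example.COM",
}

result = keyword_lowercase_analyze(request_body["text"])
print(json.dumps(result, indent=2))
```

For `john.SMITH@example.COM` the sketch yields the single token `john.smith@example.com` with offsets 0 and 22, which agrees with the values in the reviewed snippet.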
Force-pushed from 353b743 to e6059d6
Force-pushed from e6059d6 to 519f82d
@jrodewig I made all the changes you suggested and force-pushed after rebasing. PTAL!
jrodewig left a comment
LGTM. Thanks @jureaky. I'll get this merged and backported.
@elasticmachine test this please
While reading the documentation for the Keyword Tokenizer, I was a bit confused about why the Keyword Tokenizer is needed. The Keyword Tokenizer is so simple to explain that its existing example is also simple. However, the documentation mentions another use case (lower-casing email addresses), so I added a corresponding example to help users better understand why the Keyword Tokenizer is needed.
This is AS-IS:
This is Added Example: