@jrodewig
Contributor

Reformats the CJK bigram and CJK width token filter docs:

  • Adds a title abbreviation
  • Updates the description with a short example and Lucene link
  • Adds an analyze API example with resulting tokens
  • Adds or updates an example adding the token filter to an analyzer
  • Updates the parameter docs and custom token filter example

I hope to re-use this format for other token filter docs. All feedback is welcome!

@jrodewig added the >docs (General docs changes), :Search Relevance/Analysis (How text is split into tokens), v8.0.0, v7.5.0, v7.6.0, and v7.4.2 labels on Oct 17, 2019
@jrodewig requested a review from @romseygeek on October 17, 2019, 19:11
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Analysis)

@elasticmachine
Collaborator

Pinging @elastic/es-docs (>docs)

Contributor

@romseygeek left a comment

I left one comment, but it's a great improvement overall. LGTM.

Comment on lines 7 to 10
Forms https://en.wikipedia.org/wiki/Bigram[bigrams] out of the CJK (Chinese,
Japanese, and Korean) terms generated by the
<<analysis-standard-tokenizer,standard tokenizer>> or the
{plugins}/analysis-icu-tokenizer.html[ICU tokenizer].
Contributor

Strictly speaking, it will form bigrams from the CJK tokens produced by any tokenizer, so I'm not sure we need to refer to standard and icu here?

Contributor Author

Thanks @romseygeek. I removed the standard and ICU reference with cecd9bc.
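
For anyone skimming this thread without the filter docs open, here is a rough sketch of what "forming bigrams" out of CJK text means. It is not the Lucene implementation: the `cjk_bigrams` helper is a hypothetical name, and it assumes the input is already a single run of adjacent CJK characters, whichever tokenizer produced it. It also assumes that a lone character with no neighbour is kept as a unigram.

```python
from __future__ import annotations


def cjk_bigrams(run: str) -> list[str]:
    """Return overlapping two-character bigrams for one run of CJK characters.

    Hypothetical helper for illustration only; it does not inspect token
    types or character classes the way a real analysis filter would.
    """
    if len(run) < 2:
        # Assumption: a lone character has nothing to pair with,
        # so it is kept as a unigram. An empty run yields no tokens.
        return [run] if run else []
    # Slide a window of width 2 across the run, one character at a time.
    return [run[i:i + 2] for i in range(len(run) - 1)]


print(cjk_bigrams("東京都"))  # ['東京', '京都']
```

The sketch never looks at which tokenizer produced the run, which matches the point made in the review comment above.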

4 participants