From f7f71961a336e3b43826323819a7a534c6bf6c2e Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Thu, 5 Dec 2024 06:43:27 -0800
Subject: [PATCH] Fix egregious typo

---
 index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/index.html b/index.html
index 7690cd1..28f0237 100644
--- a/index.html
+++ b/index.html
@@ -91,7 +91,7 @@

Terminology

Corpus The natural language text contained by a document or set of documents which the user would like to search.

Segmentation The process of breaking natural language text up into distinct words and phrases. This often includes operations such as "named entity recognition" (such as recognizing that the three word sequence Dr. Jonas Salk is a person's name).

-Stemming A process or operation that reduces words to their "stem" or root. For example, the words runs, ran, and running all share the stem run. This some sometimes called (more formally) lemmatization and the stem is sometimes called the lemma.

+Stemming A process or operation that reduces words to their "stem" or root. For example, the words runs, ran, and running all share the stem run. This is sometimes called (more formally) lemmatization and the stem is sometimes called the lemma.

Full-Text Search refers to searches that process the entire contents of the textual document or set of documents. Full-text queries perform linguistic searches against text data in full-text indexes by operating on words and phrases based on the rules of a particular language such as English or Japanese. Full-text queries can include simple words and phrases or multiple forms of a word or phrase.

Frequently this means that a full-text search employs indexes and natural language processing. When you are using a search engine, you are using a form of full text search. Full text search often breaks natural language text into words or phrases (this is called segmentation) and may apply complex processing to get at the semantic "root" values of words (this is called stemming). These processes are sensitive to language, context, and many other aspects of textual variation.
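
The pipeline described above (segment the text, reduce each word to a stem, then look the stems up in an index) can be sketched roughly as follows. This is a minimal illustration in Python, not how any particular search engine works: the function names, the suffix list, and the sample corpus are all hypothetical, the segmenter only handles space-delimited Latin-script text, and the stemmer only strips a few English suffixes. Irregular forms such as ran need a dictionary-based lemmatizer, which this sketch does not attempt.

```python
import re
from collections import defaultdict

# Hypothetical suffix list, for illustration only. Real stemmers are
# language-specific and far more careful than this.
SUFFIXES = ("ning", "ing", "s")


def segment(text):
    """Naive segmentation: lowercase the text and pull out runs of a-z."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(corpus):
    """Map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for token in segment(text):
            index[stem(token)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every stemmed query term."""
    terms = [stem(token) for token in segment(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Made-up sample corpus keyed by document id.
    corpus = {
        1: "She runs every morning.",
        2: "Running is fun.",
        3: "He ran yesterday.",
    }
    index = build_index(corpus)
    print(sorted(search(index, "running")))  # prints [1, 2]
```

The query running matches documents 1 and 2 because runs and running both reduce to run. Document 3 is missed because ran is left unchanged by suffix stripping, which is one reason the text above calls these processes sensitive to language and context.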