An Elixir library for approximate (fuzzy) string matching.
If available in Hex, the package can be installed by adding peach to your list of dependencies in mix.exs:

    def deps do
      [
        {:peach, "~> 0.2.0"}
      ]
    end

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/peach.
To run the tests, run mix test.
To test with CSV data, create a folder called function_test_data inside the test/ folder and put the following CSVs in it:
normalise_whitespace.csv
remove_punc.csv
pre_process.csv
remove_emojis.csv
normalise_text.csv
replace_punc.csv
get_brief.csv
remove_numbers.csv
Then run mix test, or mix test.watch test/peach_test.exs --max-failures=1 --seed=0 to re-run the tests whenever a file changes.
Below are some examples of how Peach might be used for the kind of fuzzy-matching automation required in the first tier of a menu-centred chatbot.

    # Exact match: punctuation around a menu number is stripped first.
    input = "2.)"
    keyword_set = MapSet.new(["1", "2", "menu"])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_exact_match(keyword_set)

    assert matches == "2"

    # Exact match: surrounding underscores are stripped first.
    input = "_menu_"
    keyword_set = MapSet.new(["1", "2", "menu"])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_exact_match(keyword_set)

    assert matches == "menu"
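
To make the two exact-match examples above more concrete, here is a minimal sketch of what a clean-then-look-up step could look like. It is not Peach's implementation: the module ExactMatchSketch and both of its functions are hypothetical stand-ins, and the lower-casing and the nil result for "no match" are assumptions rather than documented Peach behaviour.

```elixir
defmodule ExactMatchSketch do
  # Hypothetical stand-in for Peach.pre_process/1 (assumed behaviour):
  # lower-case, drop everything that is not a letter, digit, or whitespace,
  # then collapse runs of whitespace and trim the ends.
  def pre_process(text) do
    text
    |> String.downcase()
    |> String.replace(~r/[^\p{L}\p{N}\s]/u, "")
    |> String.replace(~r/\s+/u, " ")
    |> String.trim()
  end

  # Hypothetical stand-in for Peach.find_exact_match/2: returns the cleaned
  # input if it is one of the keywords, otherwise nil (an assumption).
  def find_exact_match(cleaned, keyword_set) do
    if MapSet.member?(keyword_set, cleaned), do: cleaned, else: nil
  end
end

keywords = MapSet.new(["1", "2", "menu"])

"2.)"
|> ExactMatchSketch.pre_process()
|> ExactMatchSketch.find_exact_match(keywords)
# => "2"

"_menu_"
|> ExactMatchSketch.pre_process()
|> ExactMatchSketch.find_exact_match(keywords)
# => "menu"
```

Cleaning the input before the lookup means the keyword set only has to hold canonical forms, so the match itself is a plain set membership check.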

    # Fuzzy match: one threshold applied to every keyword.
    input = "menuu"
    keyword_set = MapSet.new(["menu", "optin", "optout"])
    threshold = 1

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_fuzzy_matches(keyword_set, threshold)

    assert matches == [{"menu", 1}]

    # Fuzzy match: each keyword carries its own threshold.
    input = "optint"
    keyword_threshold_set = MapSet.new([{"menu", 1}, {"optin", 2}, {"optout", 2}])

    matches =
      input
      |> Peach.pre_process()
      |> Peach.find_fuzzy_matches(keyword_threshold_set)

    assert matches == [{"optin", 1}, {"optout", 2}]
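
The integer scores in the fuzzy examples above are consistent with an edit distance. As a rough illustration only, the sketch below assumes Levenshtein distance and keeps every keyword whose distance is within its threshold. FuzzySketch and its functions are hypothetical names, not part of Peach, and the library's actual metric and result ordering may differ.

```elixir
defmodule FuzzySketch do
  # Levenshtein edit distance, computed one row at a time over graphemes.
  def levenshtein(a, b) do
    b_chars = String.graphemes(b)
    # Distances from the empty prefix of `a` to each prefix of `b`.
    first_row = Enum.to_list(0..length(b_chars))

    a
    |> String.graphemes()
    |> Enum.reduce(first_row, fn a_char, [prev_head | _] = prev_row ->
      # Pair each character of `b` with its {diagonal, above} cells.
      cells = Enum.zip(b_chars, Enum.zip(prev_row, tl(prev_row)))

      {row_reversed, _left} =
        Enum.reduce(cells, {[prev_head + 1], prev_head + 1}, fn
          {b_char, {diag, above}}, {acc, left} ->
            cost = if a_char == b_char, do: 0, else: 1
            cell = Enum.min([above + 1, left + 1, diag + cost])
            {[cell | acc], cell}
        end)

      Enum.reverse(row_reversed)
    end)
    |> List.last()
  end

  # One threshold shared by all keywords.
  def fuzzy_matches(input, keywords, threshold) do
    keywords
    |> Enum.map(fn keyword -> {keyword, threshold} end)
    |> then(&fuzzy_matches(input, &1))
  end

  # Each element is a {keyword, threshold} pair; closest matches come first.
  def fuzzy_matches(input, keyword_thresholds) do
    keyword_thresholds
    |> Enum.map(fn {keyword, threshold} ->
      {keyword, levenshtein(input, keyword), threshold}
    end)
    |> Enum.filter(fn {_keyword, distance, threshold} -> distance <= threshold end)
    |> Enum.sort_by(fn {_keyword, distance, _threshold} -> distance end)
    |> Enum.map(fn {keyword, distance, _threshold} -> {keyword, distance} end)
  end
end

FuzzySketch.fuzzy_matches("menuu", MapSet.new(["menu", "optin", "optout"]), 1)
# => [{"menu", 1}]

FuzzySketch.fuzzy_matches("optint", MapSet.new([{"menu", 1}, {"optin", 2}, {"optout", 2}]))
# => [{"optin", 1}, {"optout", 2}]
```

Per-keyword thresholds let short keywords such as "menu" stay strict while longer ones tolerate more typos, which appears to be the motivation for the second form shown in the examples.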