This is a direct port of @vhf's confusable_homoglyphs (https://github.com/vhf/confusable_homoglyphs) to Clojure.
"A homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar." (Wikipedia: Homoglyph)
Unicode homoglyphs can be a nuisance on the web. Your most popular client, AlaskaJazz, might be upset to be impersonated by a trickster who deliberately chose the username ΑlaskaJazz.
AlaskaJazz is single-script: only Latin characters.
ΑlaskaJazz is mixed-script: the first character is a Greek letter.
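The single-script versus mixed-script distinction can be sketched with plain JVM interop. This is a minimal illustration, not thorn's API: the function names `scripts` and `mixed-script?` are hypothetical, and the script data comes from the JDK's `Character$UnicodeScript`, which may differ from the Unicode data files that thorn ships with.

```clojure
(import '(java.lang Character$UnicodeScript))

(defn scripts
  "Returns the set of Unicode scripts used in s. COMMON and INHERITED
  (digits, punctuation, combining marks) are ignored because they are
  shared across scripts."
  [^String s]
  (->> (.toArray (.codePoints s))      ; iterate code points, not UTF-16 chars
       (map #(Character$UnicodeScript/of (int %)))
       (remove #{Character$UnicodeScript/COMMON
                 Character$UnicodeScript/INHERITED})
       set))

(defn mixed-script?
  "True when s contains characters from more than one script."
  [s]
  (> (count (scripts s)) 1))

(mixed-script? "AlaskaJazz")        ;=> false
(mixed-script? "\u0391laskaJazz")   ;=> true, \u0391 is GREEK CAPITAL LETTER ALPHA
```

The Greek alpha is written as an escape here so that the difference between the two strings is visible in the source.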
You might also want to avoid people being tricked into entering their password on www.microsоft.com or www.faϲebook.com instead of www.microsoft.com or www.facebook.com. Here is a utility to play with these confusable homoglyphs.
Not all mixed-script strings have to be ruled out, though. You could exclude only those mixed-script strings that contain characters which might be confused with a character from some Unicode blocks of your choosing.
Allo and ρττ are fine: single script.
AlloΓ is fine when our preferred script alias is 'latin': mixed script, but Γ is not confusable.
Alloρ is dangerous: mixed script, and ρ could be confused with p.
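The rule illustrated by these examples can be sketched as follows. This is an assumption-laden illustration rather than thorn's implementation: `sample-confusables` is a hypothetical, hand-picked excerpt (a real check would load the full Unicode confusables data), `dangerous?` is a made-up name, and scripts are taken from the JDK's `Character$UnicodeScript` instead of a script alias such as 'latin'.

```clojure
(import '(java.lang Character$UnicodeScript))

;; Hypothetical, hand-picked excerpt: code point -> look-alike string.
;; A real check would use the complete Unicode confusables data.
(def sample-confusables
  {0x03C1 "p"    ; GREEK SMALL LETTER RHO
   0x0391 "A"    ; GREEK CAPITAL LETTER ALPHA
   0x043E "o"    ; CYRILLIC SMALL LETTER O
   0x03F2 "c"})  ; GREEK LUNATE SIGMA SYMBOL

(defn script-of [cp]
  (Character$UnicodeScript/of (int cp)))

(defn code-points [^String s]
  (map long (.toArray (.codePoints s))))

(defn dangerous?
  "True when s mixes scripts and some character outside the preferred
  script has a look-alike written entirely in the preferred script."
  [s preferred]
  (let [cps     (code-points s)
        scripts (->> cps
                     (map script-of)
                     (remove #{Character$UnicodeScript/COMMON
                               Character$UnicodeScript/INHERITED})
                     set)]
    (boolean
      (and (> (count scripts) 1)
           (some (fn [cp]
                   (when-let [look-alike (sample-confusables cp)]
                     (and (not= preferred (script-of cp))
                          (every? #(= preferred (script-of %))
                                  (code-points look-alike)))))
                 cps)))))

(dangerous? "Allo" Character$UnicodeScript/LATIN)        ;=> false, single script
(dangerous? "Allo\u0393" Character$UnicodeScript/LATIN)  ;=> false, Γ has no entry
(dangerous? "Allo\u03C1" Character$UnicodeScript/LATIN)  ;=> true, ρ looks like p
```

Checking for mixed scripts first keeps single-script strings such as ρττ acceptable even though ρ on its own has a Latin look-alike.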
codox generated documentation.
The tests might help you get started.
thorn is available on Clojars.
Add this to your dependencies:
Distributed under the Eclipse Public License, the same as Clojure.
Port of https://github.com/vhf/confusable_homoglyphs, which is MIT-licensed.