
Homoglyph/IDN homograph detection/handling - Clojure port of python confusable_homoglyphs


mpenet/thorn


Τһогɴ


This is a direct port to Clojure of @vhf's work at https://github.com/vhf/confusable_homoglyphs.

"A homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar." (Wikipedia: Homoglyph)

Unicode homoglyphs can be a nuisance on the web. Your most popular client, AlaskaJazz, might be upset to be impersonated by a trickster who deliberately chose the username ΑlaskaJazz.

  • AlaskaJazz is single script: only Latin characters.
  • ΑlaskaJazz is mixed-script: the first character is a Greek letter.
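The single-script versus mixed-script distinction can be sketched with plain JVM interop. This is a minimal illustration, not thorn's API: the function names `code-points`, `scripts`, and `mixed-script?` are hypothetical, and the sketch relies only on `java.lang.Character$UnicodeScript` from the JDK.

```clojure
(import '(java.lang Character$UnicodeScript))

(defn code-points
  "Returns a vector of the Unicode code points in s (handles surrogate pairs)."
  [^String s]
  (loop [i 0, acc []]
    (if (< i (.length s))
      (let [cp (.codePointAt s i)]
        (recur (+ i (Character/charCount cp)) (conj acc cp)))
      acc)))

(defn scripts
  "Returns the set of scripts used in s. COMMON and INHERITED code points
  (digits, punctuation, combining marks) are skipped because they are
  shared across scripts."
  [s]
  (->> (code-points s)
       (map #(Character$UnicodeScript/of (int %)))
       (remove #{Character$UnicodeScript/COMMON
                 Character$UnicodeScript/INHERITED})
       set))

(defn mixed-script?
  "True when s draws on more than one script."
  [s]
  (> (count (scripts s)) 1))

(mixed-script? "AlaskaJazz")        ; => false (Latin only)
(mixed-script? "\u0391laskaJazz")   ; => true  (starts with GREEK CAPITAL LETTER ALPHA)
```

The library itself may classify characters differently; see its generated documentation and tests for the real entry points.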

You might also want to avoid people being tricked into entering their password on www.microsоft.com or www.faϲebook.com instead of www.microsoft.com or www.facebook.com. Here is a utility to play with these confusable homoglyphs.

Not all mixed-script strings have to be ruled out, though. You could exclude only those mixed-script strings that contain characters which might be confused with a character from Unicode blocks of your choosing.

  • Allo and ρττ are fine: single script.
  • AlloΓ is fine when our preferred script alias is 'latin': mixed script, but Γ is not confusable.
  • Alloρ is dangerous: mixed script and ρ could be confused with p.
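The three cases above can be reproduced with a small self-contained sketch. It is an illustration under stated assumptions, not thorn's implementation: `sample-confusables` is a tiny hand-picked table rather than the full Unicode confusables data, and `dangerous?`, `script-of`, and `code-points` are hypothetical names chosen for this example.

```clojure
(import '(java.lang Character$UnicodeScript))

;; ASSUMPTION: a hand-picked sample, NOT the full Unicode confusables table.
;; Maps a code point to a character it can be mistaken for.
(def sample-confusables
  {0x03C1 \p    ; GREEK SMALL LETTER RHO
   0x0391 \A    ; GREEK CAPITAL LETTER ALPHA
   0x043E \o    ; CYRILLIC SMALL LETTER O
   0x03F2 \c})  ; GREEK LUNATE SIGMA SYMBOL

(defn code-points
  "Returns a vector of the Unicode code points in s."
  [^String s]
  (loop [i 0, acc []]
    (if (< i (.length s))
      (let [cp (.codePointAt s i)]
        (recur (+ i (Character/charCount cp)) (conj acc cp)))
      acc)))

(defn script-of [cp]
  (Character$UnicodeScript/of (int cp)))

(defn dangerous?
  "True when s is mixed-script AND contains a character outside the
  preferred script whose look-alike belongs to the preferred script."
  [s preferred]
  (let [cps  (code-points s)
        used (->> (map script-of cps)
                  (remove #{Character$UnicodeScript/COMMON
                            Character$UnicodeScript/INHERITED})
                  set)]
    (boolean
     (and (> (count used) 1)
          (some (fn [cp]
                  (when-let [look-alike (sample-confusables cp)]
                    (and (not= preferred (script-of cp))
                         (= preferred (script-of (int look-alike))))))
                cps)))))

(def latin Character$UnicodeScript/LATIN)

(dangerous? "Allo" latin)               ; => false (single script)
(dangerous? "\u03C1\u03C4\u03C4" latin) ; => false (single script, all Greek)
(dangerous? "Allo\u0393" latin)         ; => false (mixed, but GAMMA is not in the table)
(dangerous? "Allo\u03C1" latin)         ; => true  (mixed, and RHO resembles p)
```

A real check would load the complete confusables data instead of a four-entry map; the structure of the decision stays the same.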

Documentation

codox generated documentation.

The tests might help you get started.

Installation

thorn is available on Clojars.

Add the coordinates shown on the Clojars project page to your dependencies.

License

Distributed under the Eclipse Public License, the same as Clojure.

Port of https://github.com/vhf/confusable_homoglyphs, which is MIT-licensed.
