no multiple diactritics rule, turned off by default
aborazmeh committed Jul 12, 2024
1 parent 554fbc3 commit f81e916
Showing 3 changed files with 57 additions and 3 deletions.
12 changes: 10 additions & 2 deletions README.md
@@ -74,6 +74,14 @@ you can specify that option, fixing this will normalize the character

أهلًا وسهلًا

## No Multiple Diacritics on the Same Letter

This option is `false` by default. Turn it on to require that each letter carries at most one basic diacritic (fatha, damma, kasra, or sukun). A shadda combined with a single basic diacritic is still allowed.

وَأَيوُّبَ وَيُوسَُِف": قُرأ يوسف بضم السين وكسرها وفتحها، وكلُّ هذه القراءات لغات، أفصحها ضمٌ السين.

The example above is reported as an error when the option is turned on.
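
The check can be sketched roughly as follows. This is a simplified illustration, not the rule's actual code: the helper name `findMultipleDiacritics` is hypothetical, and the character classes list only the primary Arabic code points, whereas the rule covers many more variants.

```typescript
// Simplified classes (assumption): primary code points only.
const basicDiacritics = "[\u064E\u064F\u0650\u0652]"; // fatha, damma, kasra, sukun
const shadda = "\u0651";

// Hypothetical helper: returns [start, end) ranges where a letter (or a shadda)
// is followed by two or more basic diacritics.
function findMultipleDiacritics(text: string): Array<[number, number]> {
    const pattern = new RegExp(
        `(\\p{Letter}|${shadda})${basicDiacritics}(${basicDiacritics}|${shadda})*?${basicDiacritics}`,
        "gu"
    );
    const ranges: Array<[number, number]> = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
        ranges.push([match.index, match.index + match[0].length]);
    }
    return ranges;
}

// seen + fatha + damma + kasra: several basic diacritics on one letter
const stacked = "\u0633\u064E\u064F\u0650";
// dal + shadda + fatha: a shadda plus a single basic diacritic
const shaddaThenFatha = "\u062F\u0651\u064E";

console.log(findMultipleDiacritics(stacked)); // [[0, 3]]
console.log(findMultipleDiacritics(shaddaThenFatha)); // []
```

Because the middle group is lazy (`*?`), a match ends at the second basic diacritic, so the reported range covers the letter and its first two diacritics rather than the whole stack.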

## Usage

Via `.textlintrc.json` (Recommended)
@@ -89,7 +97,8 @@ These are default options, you can change them in your .textlintrc file
"no_shadda_with_sukun": true,
"no_duplicated_diacritics": true,
"only_fathatan_on_alef": true,
"fathatan_before_alef": true
"fathatan_before_alef": true,
"no_multiple_diactritics": false
}
}
}
@@ -123,7 +132,6 @@ Test textlint rule by [textlint-tester](https://github.com/textlint/textlint-tes
- No Haraka *and* Sukun on the same letter
- No Sukun on the first letter of the word
- No Madda with Hamza
- Option for no combining diacritics

## License

33 changes: 32 additions & 1 deletion src/index.ts
@@ -10,20 +10,27 @@ export interface Options {
no_middle_tanween?: boolean;
only_fathatan_on_alef?: boolean;
fathatan_before_alef?: boolean;
no_multiple_diactritics?: boolean;
}

const fatha = "\u0618\u064E\u065E\u08E4\u08F4\u08F5\uFC60\uFCF2\uFE76\uFE77";
const damma = "\u0619\u064F\u0657\u065D\u08E3\u08E5\u08FE\uFC61\uFCF3\uFE78\uFE79";
const kasra = "\u061A\u0650\u08D8\u08D9\u08E6\u08F6\uFC62\uFCF4\uFE7A\uFE7B";
const sukun = "\u0652\u07B0\u082C\u08D0\u0AFA\uFE7E\uFE7F\u{1123E}";

const alefWithFathatan = "\uFD3C\uFD3D";
const fathatan = "\u064B\u08E7\u08F0\uFE70\uFE71";
const dammatan = "\u064C\u08E8\u08F1\uFC5E\uFE72";
const kasratan = "\u064D\u08E9\u08F2\uFC5F\uFE74";

const regex = {
basicDiacritics: `[${fatha}${damma}${kasra}${sukun}]`,
diacritics: "[\u064B-\u0653]",
tanween: `[${alefWithFathatan}${fathatan}${dammatan}${kasratan}]`,
shadda: "[\u0651\u0AFB\uFC5E-\uFC63\uFCF2-\uFCF4\uFE7C\uFE7D\u{11237}]",
madda: "\u0653",
alefMadda: "[\u0622\uFE81\uFE82\uFEF5\uFEF6]",
sukun: "[\u0652\u07B0\u082C\u08D0\u0AFA\uFE7E\uFE7F\u1123E]",
sukun: `[${sukun}]`,
alef: "[\u0627\uFE8D\uFE8E\u{1EE00}]"
};

@@ -479,6 +486,25 @@ function fathatanBeforeAlef(node: TxtStrNode, text: string, context: Readonly<Te
}
}

function noMultipleDiactritics(node: TxtStrNode, text: string, context: Readonly<TextlintRuleContext>) {
const { report, locator, RuleError } = context;

const matches = text.matchAll(
new RegExp(
`(\\p{Letter}|${regex.shadda})${regex.basicDiacritics}(${regex.basicDiacritics}|${regex.shadda})*?${regex.basicDiacritics}`,
"ug"
)
);
for (const match of matches) {
const index = match.index ?? 0;
const matchRange = [index, index + match[0].length] as const;
const ruleError = new RuleError("Found multiple diactritic on the same letter.", {
padding: locator.range(matchRange)
});
report(node, ruleError);
}
}

const report: TextlintRuleModule<Options> = (context, options = {}) => {
const { getSource, Syntax } = context;
return {
@@ -491,6 +517,7 @@ const report: TextlintRuleModule<Options> = (context, options = {}) => {
const noMiddleTanweenOpt = options.no_middle_tanween ?? true;
const onlyFathatanOnAlefOpt = options.only_fathatan_on_alef ?? true;
const fathatanBeforeAlefOpt = options.fathatan_before_alef ?? true;
const noMultipleDiactriticsOpt = options.no_multiple_diactritics ?? false;

const text = getSource(node); // Get text

@@ -523,6 +550,10 @@ const report: TextlintRuleModule<Options> = (context, options = {}) => {
if (fathatanBeforeAlefOpt) {
fathatanBeforeAlef(node, text, context);
}

if (noMultipleDiactriticsOpt) {
noMultipleDiactritics(node, text, context);
}
}
};
};
15 changes: 15 additions & 0 deletions test/index-test.ts
@@ -8,6 +8,11 @@ tester.run("rule", rule, {
"الآن",
"ضيّق",
"يونَُِس: قال أبو عبيدة، «يقال:يونس بضم النون وكسرها». والمشهور في القراءة يونُس برفع النون من غير همز.",
`"وَأَيوُّبَ وَيُوسَُِف": قُرأ يوسف بضم السين وكسرها وفتحها، وكلُّ هذه القراءات لغات، أفصحها ضمٌ السين.`,
{
text: "حركةٌ وشدَّةٌ ليستا حركتان على الحرف نفسه",
options: { no_multiple_diactritics: true }
},
{
text: "أهلاً وسهلاً",
options: {
@@ -196,6 +201,16 @@ tester.run("rule", rule, {
message: "Found Fathatan on Alef."
}
]
},
{
text: `"وَأَيوُّبَ وَيُوسَُِف": قُرأ يوسف بضم السين وكسرها وفتحها، وكلُّ هذه القراءات لغات، أفصحها ضمٌ السين.`,
options: { no_multiple_diactritics: true },
errors: [
{
message: "Found multiple diactritic on the same letter.",
range: [17, 20]
}
]
}
]
});
