Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ar] Fix some bugs and add new rules #6785

Closed
wants to merge 11 commits into from

Conversation

sohaibafifi
Copy link
Contributor

@sohaibafifi sohaibafifi commented Jun 14, 2022

Fix sohaibafifi#19 and add many new rules:

  • ArabicTransVerbDirectToIndirectRule
  • ArabicTransVerbIndirectToDirectRule
  • ArabicTransVerbIndirectToIndirectRule
  • ArabicInflectedOneWordReplaceRule

TODO:

  • Cleanup commented code
  • Update Copyright headers
  • Remove duplicated code

@sohaibafifi sohaibafifi changed the title [ar] Fix https://github.com/sohaibafifi/languagetool/issues/19 and ad… [ar] Fix sohaibafifi#19 and add many new rules Jun 14, 2022
@sohaibafifi sohaibafifi changed the title [ar] Fix sohaibafifi#19 and add many new rules [ar] Fix some bugs add many new rules Jun 14, 2022
@sohaibafifi sohaibafifi marked this pull request as ready for review June 14, 2022 19:08
@sohaibafifi sohaibafifi changed the title [ar] Fix some bugs add many new rules [ar] Fix some bugs and add new rules Jun 15, 2022
@sohaibafifi
Copy link
Contributor Author

Hello @danielnaber,

Can you please review this PR before the next version freeze?

Thank you

try {
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1;
} catch (NumberFormatException e) {
e.printStackTrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't print stacktraces like this - if this is an internal bug, the exception should be thrown. Otherwise, it should be logged with the logging framework we use elsewhere.

// generate suggestion
List<String> suggestionList = prepareSuggestions(compList, noun);
for (String sug : suggestionList) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty line


/* prepare suggesiyton for a list of comparative */
protected static List<String> prepareSuggestions(List<String> compList, String noun) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty lines


/* test if the word is an isolated pronoun */
private static boolean isPronoun(String word) {
if (word == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use {..} even for single line if statements

return isolatedToAttachedPronoun.getOrDefault(word, "");
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty line

return "ARABIC_COMMA_PARENTHESIS_WHITESPACE2";
}

//@Override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this commented out?

import static java.lang.Math.min;

/**
* TODO : Sohaib - check the duplicated code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be done before merging the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, many codes come from the English part. I was thinking of a way to reduce such duplications.  

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please clean up the duplication before we merge the PR.

}

private boolean isCandidateVerb(AnalyzedToken mytoken) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty line


private boolean isCandidateVerb(AnalyzedToken mytoken) {

if (getSuggestedPreposition(mytoken) != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{...}

try {
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1;
} catch (NumberFormatException e) {
e.printStackTrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log or throw

@danielnaber
Copy link
Member

@sohaibafifi I've started a review, but I see a lot of TODO and FIXME. Are you sure the code is ready to be merged? How have you tested it (other than having unit tests)? Please also make sure the code has no random empty lines etc and complies at https://dev.languagetool.org/code-style. In the future, it would be better to have several smaller PRs, as reviewing such a massive change set is a lot of work.

@sohaibafifi sohaibafifi requested a review from danielnaber June 21, 2022 08:56
Copy link
Member

@danielnaber danielnaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure all the formal issues (CamelCase instead of under_scores, indent, {} for single-line if-statements etc.) are cleaned up first

try {
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1;
} catch (NumberFormatException e) {
throw new RuntimeException("Error parsing adj_pos from : " + arguments.get("adj_pos"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too much indent

for (AnalyzedToken wordTok : token.getReadings()) {
// test if the first token is a to replace word
boolean is_candidate_word = isCandidateWord(wordTok);
if (is_candidate_word) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use CamelCase

if (propositionsWithMessage != null) {
propositions = Arrays.asList(propositionsWithMessage.getSuggestion().split("\\|"));
sug_msg = propositionsWithMessage.getMessage();
sug_msg = sug_msg != null ? sug_msg : "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CamelCase

@sohaibafifi sohaibafifi requested a review from danielnaber June 21, 2022 21:24
Copy link
Member

@danielnaber danielnaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this won't make it into LT 5.8 anymore, but we can merge it after the release once the remaining issues get cleaned up. Please make sure there's no duplication, each class should have a test (or have its test updated), and there should be no random empty lines that are not useful to structure the code (e.g. an empty line before } is rarely useful).

try {
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1;
} catch (NumberFormatException e) {
throw new RuntimeException("Error parsing adj_pos from : " + arguments.get("adj_pos"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use e as the second argument here, so no information is lost.

return newMatch;
}

/* prepare suggesiyton for a list of comparative */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "suggesiyton"

super.setCategory(Categories.MISC.getCategory(messages));
setLocQualityIssueType(ITSIssueType.Inconsistency);
addExamplePair(Example.wrong("أجريت <marker>أبحاثا</marker> في المخبر"),
Example.fixed("أجريت <marker>بحوثا</marker> في المخبر."));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please indent so that the Examples are aligned

/**
* Load words, normalized to lowercase unless starting with '*'.
*/
private static Set<String> loadWords(String path) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicated from AvsAnData, can this be moved to a Tools method?

String postag = tk.getPOSTag();
if (tagmanager.isNoun(postag) && !tagmanager.isDefinite(postag) && !tagmanager.hasPronoun(postag)) {
// add inflection flag
if (inflection == "jar") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does == really work here or is equals needed?

nextPos = size + nextPos;
}
} catch (NumberFormatException e) {
e.printStackTrace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really print?

} catch (StringIndexOutOfBoundsException e) {
int pos = getFlagPos(postag, flagType);
// debug only
System.out.println("ArabicTagmanager:Exception: pos flag" + Integer.toString(pos) + "flagtype:" + flagType + " postag:" + tmp + " len:" + Integer.toString(tmp.length()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be logging or exception, but not System.out.println

newposTag = setFlag(newposTag, "CONJ", '-');
newposTag = setFlag(newposTag, "JAR", '-');
// if the word is a definated word, set the definition flag
if (isDefinite(postag))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{...}

}
// show only no suggestion cases
if (!suggestion.startsWith("[ي")) {
System.out.println("مثال: " + word1 + " " + word2 + " مقترح:" + suggestion);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are usually not System.out.println in tests other than those that print progress. Is this an error? Then the test should fail.

@danielnaber
Copy link
Member

Is this ready to be reviewed now?

@sohaibafifi sohaibafifi marked this pull request as draft July 6, 2022 13:24
@sohaibafifi sohaibafifi marked this pull request as ready for review December 18, 2022 10:03
@danielnaber
Copy link
Member

@sohaibafifi It seems not all my comments from https://github.com/languagetool-org/languagetool/pull/6785/files were addressed. Also, could you please turn this into smaller PRs? This one is too large to be reviewed. There should be one PR per new rule, for example, and whitespace changes should be a separate commit.

@sohaibafifi
Copy link
Contributor Author

sohaibafifi commented Dec 18, 2022

I’m very sorry, I'm afraid it's going to be a little complicated this time. we'll be careful the next time.

@sohaibafifi
Copy link
Contributor Author

Closed and replaced by #8286, #8287 and #8288.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

الحركات تغير مواضع الكلمات
3 participants