-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ar] Fix some bugs and add new rules #6785
Conversation
Hello @danielnaber, Can you please review this PR before the next version freeze? Thank you |
try { | ||
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1; | ||
} catch (NumberFormatException e) { | ||
e.printStackTrace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't print stacktraces like this - if this is an internal bug, the exception should be thrown. Otherwise, it should be logged with the logging framework we use elsewhere.
// generate suggestion | ||
List<String> suggestionList = prepareSuggestions(compList, noun); | ||
for (String sug : suggestionList) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line
|
||
/* prepare suggesiyton for a list of comparative */ | ||
protected static List<String> prepareSuggestions(List<String> compList, String noun) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty lines
|
||
/* test if the word is an isolated pronoun */ | ||
private static boolean isPronoun(String word) { | ||
if (word == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use {..} even for single line if
statements
return isolatedToAttachedPronoun.getOrDefault(word, ""); | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line
return "ARABIC_COMMA_PARENTHESIS_WHITESPACE2"; | ||
} | ||
|
||
//@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this commented out?
import static java.lang.Math.min; | ||
|
||
/** | ||
* TODO : Sohaib - check the duplicated code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be done before merging the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, many codes come from the English part. I was thinking of a way to reduce such duplications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please clean up the duplication before we merge the PR.
} | ||
|
||
private boolean isCandidateVerb(AnalyzedToken mytoken) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty line
|
||
private boolean isCandidateVerb(AnalyzedToken mytoken) { | ||
|
||
if (getSuggestedPreposition(mytoken) != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{...}
try { | ||
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1; | ||
} catch (NumberFormatException e) { | ||
e.printStackTrace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log or throw
@sohaibafifi I've started a review, but I see a lot of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure all the formal issues (CamelCase instead of under_scores, indent, {} for single-line if-statements etc.) are cleaned up first
try { | ||
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1; | ||
} catch (NumberFormatException e) { | ||
throw new RuntimeException("Error parsing adj_pos from : " + arguments.get("adj_pos")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too much indent
for (AnalyzedToken wordTok : token.getReadings()) { | ||
// test if the first token is a to replace word | ||
boolean is_candidate_word = isCandidateWord(wordTok); | ||
if (is_candidate_word) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use CamelCase
if (propositionsWithMessage != null) { | ||
propositions = Arrays.asList(propositionsWithMessage.getSuggestion().split("\\|")); | ||
sug_msg = propositionsWithMessage.getMessage(); | ||
sug_msg = sug_msg != null ? sug_msg : ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CamelCase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this won't make it into LT 5.8 anymore, but we can merge it after the release once the remaining issues get cleaned up. Please make sure there's no duplication, each class should have a test (or have its test updated), and there should be no random empty lines that are not useful to structure the code (e.g. an empty line before }
is rarely useful).
try { | ||
adjTokenIndex = Integer.valueOf(arguments.get("adj_pos")) - 1; | ||
} catch (NumberFormatException e) { | ||
throw new RuntimeException("Error parsing adj_pos from : " + arguments.get("adj_pos")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use e
as the second argument here, so no information is lost.
return newMatch; | ||
} | ||
|
||
/* prepare suggesiyton for a list of comparative */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo "suggesiyton"
super.setCategory(Categories.MISC.getCategory(messages)); | ||
setLocQualityIssueType(ITSIssueType.Inconsistency); | ||
addExamplePair(Example.wrong("أجريت <marker>أبحاثا</marker> في المخبر"), | ||
Example.fixed("أجريت <marker>بحوثا</marker> في المخبر.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please indent so that the Example
s are aligned
...uage-modules/ar/src/main/java/org/languagetool/rules/ar/ArabicPunctuationWhitespaceRule.java
Outdated
Show resolved
Hide resolved
/** | ||
* Load words, normalized to lowercase unless starting with '*'. | ||
*/ | ||
private static Set<String> loadWords(String path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicated from AvsAnData, can this be moved to a Tools method?
String postag = tk.getPOSTag(); | ||
if (tagmanager.isNoun(postag) && !tagmanager.isDefinite(postag) && !tagmanager.hasPronoun(postag)) { | ||
// add inflection flag | ||
if (inflection == "jar") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does ==
really work here or is equals
needed?
nextPos = size + nextPos; | ||
} | ||
} catch (NumberFormatException e) { | ||
e.printStackTrace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really print?
} catch (StringIndexOutOfBoundsException e) { | ||
int pos = getFlagPos(postag, flagType); | ||
// debug only | ||
System.out.println("ArabicTagmanager:Exception: pos flag" + Integer.toString(pos) + "flagtype:" + flagType + " postag:" + tmp + " len:" + Integer.toString(tmp.length())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be logging or exception, but not System.out.println
newposTag = setFlag(newposTag, "CONJ", '-'); | ||
newposTag = setFlag(newposTag, "JAR", '-'); | ||
// if the word is a definated word, set the definition flag | ||
if (isDefinite(postag)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{...}
} | ||
// show only no suggestion cases | ||
if (!suggestion.startsWith("[ي")) { | ||
System.out.println("مثال: " + word1 + " " + word2 + " مقترح:" + suggestion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are usually not System.out.println
in tests other than those that print progress. Is this an error? Then the test should fail.
Is this ready to be reviewed now? |
@sohaibafifi It seems not all my comments from https://github.com/languagetool-org/languagetool/pull/6785/files were addressed. Also, could you please turn this into smaller PRs? This one is too large to be reviewed. There should be one PR per new rule, for example, and whitespace changes should be a separate commit. |
I’m very sorry, I'm afraid it's going to be a little complicated this time. we'll be careful the next time. |
Fix sohaibafifi#19 and add many new rules:
TODO: