English whole-word match configuration not taking effect #69

Closed
hangker1997 opened this issue Jul 30, 2024 · 2 comments

@hangker1997

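My configuration class is shown below. I control sensitive words with allow/deny lists, but only the deny list is actually used, and every entry in it is English. I enabled the English whole-word match option,
but it still does not take effect. For example, the deny list contains cp, so cpm should be legal, yet it is still flagged. I can't tell what is going wrong. I am using version 0.14.0.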
@Bean
public SensitiveWordBs sensitiveWordBs() {
    return SensitiveWordBs.newInstance()
            .wordAllow(WordAllows.chains(WordAllows.defaults(), myDdWordAllow))
            .wordDeny(myDdWordDeny)
            // English whole-word match
            .wordResultCondition(WordResultConditions.englishWordMatch())
            // various other settings
            // do not ignore full-width/half-width differences (e.g. Chinese vs. English parentheses)
            .ignoreWidth(false)
            .init();
}

@hangker1997
Author

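Resolved. The cause is that whole-word matching fails when the sensitive word is followed by exactly one character: with cp as a sensitive word, cpm is still flagged, but when cp is followed by two or more characters it is not flagged.
I fixed it by overriding the doMatch method of AbstractWordResultCondition: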
import com.github.houbb.heaven.util.lang.CharUtil;
import com.github.houbb.sensitive.word.api.IWordContext;
import com.github.houbb.sensitive.word.api.IWordResult;
import com.github.houbb.sensitive.word.constant.enums.WordValidModeEnum;
import com.github.houbb.sensitive.word.support.resultcondition.AbstractWordResultCondition;

public class EnglishWordMatch extends AbstractWordResultCondition {

    @Override
    protected boolean doMatch(IWordResult wordResult, String text, WordValidModeEnum modeEnum, IWordContext context) {
        final int startIndex = wordResult.startIndex();
        final int endIndex = wordResult.endIndex();

        // If the character before the match is an English letter, reject the match.
        if (startIndex > 0) {
            char preC = text.charAt(startIndex - 1);
            if (CharUtil.isEnglish(preC)) {
                return false;
            }
        }

        // If the character right after the match is an English letter, reject the match.
        if (endIndex < text.length()) {
            char afterC = text.charAt(endIndex);
            if (CharUtil.isEnglish(afterC)) {
                return false;
            }
        }

        // Check whether the matched span itself is an English word.
        // Note: both paths below return true, so this loop does not affect the result.
        for (int i = startIndex; i < endIndex; i++) {
            char c = text.charAt(i);
            if (!CharUtil.isEnglish(c)) {
                return true;
            }
        }

        return true;
    }

}
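
The symptom described above (one trailing letter is missed, two or more are caught) is what an off-by-one in the right-hand boundary check would produce. The sketch below reproduces it without the library. It rests on assumptions this thread does not confirm: that the earlier check read the character at endIndex + 1 rather than endIndex, and that endIndex is exclusive, as the workaround above treats it. The class and method names are hypothetical, and isAsciiLetter is only a stand-in for CharUtil.isEnglish.

```java
// Library-free sketch; class and method names are hypothetical.
class EnglishBoundarySketch {

    // Stand-in for CharUtil.isEnglish: treats only ASCII letters as English.
    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Whole-word check over the half-open span [start, end),
    // mirroring the two boundary checks in the workaround above.
    static boolean isWholeWord(String text, int start, int end) {
        if (start > 0 && isAsciiLetter(text.charAt(start - 1))) {
            return false;
        }
        if (end < text.length() && isAsciiLetter(text.charAt(end))) {
            return false;
        }
        return true;
    }

    // ASSUMED buggy variant: inspects the character one position too far right,
    // so a single trailing letter is never examined.
    static boolean isWholeWordOffByOne(String text, int start, int end) {
        if (start > 0 && isAsciiLetter(text.charAt(start - 1))) {
            return false;
        }
        if (end < text.length() - 1 && isAsciiLetter(text.charAt(end + 1))) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // "cp" occupies the span [0, 2) in every text below.
        System.out.println(isWholeWord("cp", 0, 2));           // true: standalone word
        System.out.println(isWholeWord("cpm", 0, 2));          // false: 'm' follows
        System.out.println(isWholeWordOffByOne("cpm", 0, 2));  // true: 'm' is skipped
        System.out.println(isWholeWordOffByOne("cpmx", 0, 2)); // false: 'x' is seen
    }
}
```

A result of true means the match is kept, so the text is flagged. Under these assumptions, the off-by-one variant keeps the cp match inside cpm, which matches the report.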

@houbb
Owner

houbb commented Aug 28, 2024

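Thanks for the heads-up. This has been fixed in v0.19.1. For improvements like this in the future, feel free to open a PR and I will merge them together.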

@houbb houbb closed this as completed Aug 28, 2024