Emoji is not supported when a profanity is found next to that character #71

Closed
2 of 5 tasks
rion18 opened this issue Aug 2, 2024 · 2 comments
Labels
bug Something isn't working

Comments

Contributor

rion18 commented Aug 2, 2024

Expected behavior

I am using obscenity to censor a string that contains an emoji directly followed by a profanity, for example 🤣bummer, with a dataset that includes the word bummer.

With the following strategy, which removes each match entirely,

const CENSOR_STRATEGY = (censorContext) => ''.repeat(censorContext.matchLength);

the expected output is 🤣.

Actual behavior

Instead, the output is 🤣b. The word bummer is matched correctly, but the match indices the matcher reports are wrong, so the censor removes the wrong range and leaves the leading b behind.
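A possible explanation, which is an assumption and not something confirmed in this thread, is a mismatch between UTF-16 code units and Unicode code points: JavaScript indexes strings by code unit, and 🤣 takes two of them. The sketch below uses plain JavaScript only (no obscenity APIs), and `codePointToCodeUnitIndex` is a hypothetical helper written for illustration. It shows how the two ways of counting disagree for this input and which index a correct removal needs.

```javascript
// Illustration only: plain JavaScript, no obscenity APIs involved.
const input = '🤣bummer';

// JavaScript strings are indexed by UTF-16 code units; 🤣 (U+1F923) needs two.
const codeUnitLength = input.length;              // counts code units
const codePointLength = Array.from(input).length; // counts code points

// Hypothetical helper: convert a code-point index into a code-unit index.
function codePointToCodeUnitIndex(str, codePointIndex) {
  let codeUnitIndex = 0;
  let seen = 0;
  for (const ch of str) {        // for...of iterates by code point
    if (seen === codePointIndex) break;
    codeUnitIndex += ch.length;  // 1 for BMP characters, 2 for surrogate pairs
    seen += 1;
  }
  return codeUnitIndex;
}

// 'b' is the second code point (index 1) but sits at code-unit index 2.
const start = codePointToCodeUnitIndex(input, 1);
const censored = input.slice(0, start) + input.slice(start + 'bummer'.length);
console.log(codeUnitLength, codePointLength, start, censored);
```

Any index that is computed in one unit and then consumed in the other will be shifted by one for each preceding surrogate pair, which would be consistent with a single stray character surviving the removal.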

Minimal reproducible example

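const {
  englishDataset,
  parseRawPattern,
  DataSet,
  RegExpMatcher,
  TextCensor,
} = require('obscenity');

// Strategy from "Expected behavior": replace every match with an empty string.
const CENSOR_STRATEGY = (censorContext) => ''.repeat(censorContext.matchLength);
// The original snippet used textCensor without defining it; this setup is assumed.
const textCensor = new TextCensor().setStrategy(CENSOR_STRATEGY);

// English preset plus one extra phrase for the word "bummer".
const profanityDataset = new DataSet()
    .addAll(englishDataset)
    .addPhrase(phrase =>
      phrase
        .setMetadata({ originalWord: 'bummer' })
        .addPattern(parseRawPattern('bummer'))
    ).build();

const matcher = new RegExpMatcher({
    ...profanityDataset, // no transformers
  });

// Wrapped in a function so the return statements are valid.
function censor(input) {
  if (matcher.hasMatch(input)) {
    const matches = matcher.getAllMatches(input, true);
    return textCensor.applyTo(input, matches);
  }
  return input;
}

console.log(censor('🤣bummer')); // 🤣b on 0.3.1; expected 🤣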

Steps to reproduce

No response

Additional context

No response

Node.js version

18.17.1

Obscenity version

0.3.1

Priority

  • Low
  • Medium
  • High

Terms

  • I agree to follow the project's Code of Conduct.
  • I have searched existing issues for similar reports.
@rion18 rion18 added the bug Something isn't working label Aug 2, 2024
Owner

jo3-l commented Aug 2, 2024

Thanks for the short repro. I think I know what the issue is and will take a stab at fixing it today.

@jo3-l jo3-l closed this as completed in 3a49579 Aug 2, 2024
Owner

jo3-l commented Aug 2, 2024

Fix released in v0.4.0.
