fix: Mixed Case false positives #1430

Merged on Jun 11, 2023 (11 commits)

Conversation

briandonahue
Collaborator

@briandonahue briandonahue commented May 3, 2023

Summary:

Should close #1402. Fixes false positives in the Mixed Case warning, such as values that mix numbers and letters.

Expected behavior:

Values that combine letters and numbers, such as A102 or 302-B, and values with only numbers, like 321, should not trigger a warning.
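
As a rough illustration of the expected behavior above, here is a minimal sketch. It is not the validator's implementation: the class `MixedCaseSketch`, the method `shouldWarn`, and the rule itself (skip whitespace-separated tokens that contain a digit, then warn only if the remaining letters all share one case) are assumptions made for this example.

```java
final class MixedCaseSketch {

  /**
   * Hypothetical rule: returns true when a value should trigger a mixed-case warning.
   * Tokens containing a digit are ignored, so "A102", "302-B" and "321" never warn.
   */
  static boolean shouldWarn(String value) {
    boolean sawUpper = false;
    boolean sawLower = false;
    int letterCount = 0;
    for (String token : value.trim().split("\\s+")) {
      // Skip any token that contains a digit (assumed exemption).
      if (token.chars().anyMatch(Character::isDigit)) {
        continue;
      }
      for (char c : token.toCharArray()) {
        if (Character.isLetter(c)) {
          letterCount++;
        }
        if (Character.isUpperCase(c)) {
          sawUpper = true;
        }
        if (Character.isLowerCase(c)) {
          sawLower = true;
        }
      }
    }
    // Warn only when there are at least two letters and they are all one case.
    return letterCount > 1 && sawUpper != sawLower;
  }

  public static void main(String[] args) {
    for (String value : new String[] {"A102", "302-B", "321", "ROUTE 1", "Main Street"}) {
      System.out.println(value + " -> warn: " + shouldWarn(value));
    }
  }
}
```

Under these assumptions only "ROUTE 1" warns among the sample values, because "ROUTE" is the one digit-free token written in a single case.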

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@github-actions
Contributor

github-actions bot commented May 3, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1426 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1426 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 0 out of 1426 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Warnings: 272 out of 1426 datasets (~19%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1426 sources (~0 %) are corrupted.
Commit: 9e96ae0
Download the full acceptance test report here (report will disappear after 90 days).

{"ROUTE 1", false},
{"route 1 Boulevard", false},
{"Another bad value", false},
{"MixedCaseButSingleWord", false},
Collaborator Author

Feel free to add/suggest more test values.

@github-actions
Contributor

❌ Invalid acceptance test.
New Errors: 0 out of 1427 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1427 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 134 out of 1427 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 37 out of 1427 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1427 sources (~0 %) are corrupted.
Commit: aa1e457
Download the full acceptance test report here (report will disappear after 90 days).

@github-actions
Contributor

❌ Invalid acceptance test.
New Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 134 out of 1428 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 37 out of 1428 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1428 sources (~0 %) are corrupted.
Commit: 30f09aa
Download the full acceptance test report here (report will disappear after 90 days).

@briandonahue briandonahue force-pushed the issue/1402/mixed_case_numbers branch from a496c24 to 2c602a8 May 24, 2023 14:36
@briandonahue briandonahue marked this pull request as ready for review May 24, 2023 14:39
@briandonahue briandonahue self-assigned this May 24, 2023
@github-actions
Contributor

❌ Invalid acceptance test.
New Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 134 out of 1428 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 37 out of 1428 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1428 sources (~0 %) are corrupted.
Commit: f70a7e7
Download the full acceptance test report here (report will disappear after 90 days).

@briandonahue
Collaborator Author

@davidgamez I've seen this test fail a few times on different PRs - let me know if there is anything I can do / look for to prevent or fix when this happens.

@davidgamez
Member

davidgamez commented May 24, 2023

> @davidgamez I've seen this test fail a few times on different PRs - let me know if there is anything I can do / look for to prevent or fix when this happens.

Hi @briandonahue, I think the test is doing what it is expected to do. In this PR, as in others, changes were made to the notice logic, so the share of datasets with new/dropped notices exceeds the 1% threshold. What we can do is review the test results and make sure the changes we are planning make sense. You can download the reports and review the example notices.
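
To make the threshold arithmetic concrete, here is a minimal sketch using figures from the bot reports in this thread. The class and method names are hypothetical, and both the strict "greater than" comparison and the "any category above the threshold fails the report" rule are assumptions inferred from the report wording, not taken from the acceptance test code.

```java
final class AcceptanceThresholdSketch {

  /** Percentage of datasets affected by a change, e.g. 272 out of 1426. */
  static double percentAffected(int affectedDatasets, int totalDatasets) {
    return 100.0 * affectedDatasets / totalDatasets;
  }

  /** Assumed comparison: a category is over the limit when its share is above the threshold. */
  static boolean exceedsThreshold(int affectedDatasets, int totalDatasets, double thresholdPercent) {
    return percentAffected(affectedDatasets, totalDatasets) > thresholdPercent;
  }

  /** Assumed rule: the report is invalid if any category exceeds the threshold. */
  static boolean reportIsInvalid(int totalDatasets, double thresholdPercent, int... affectedCounts) {
    for (int count : affectedCounts) {
      if (exceedsThreshold(count, totalDatasets, thresholdPercent)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // First report in this thread: 272 dropped warnings out of 1426 datasets.
    System.out.printf("%.1f%%%n", percentAffected(272, 1426));
    // Later report: new errors 0, dropped errors 0, new warnings 3, dropped warnings 563.
    System.out.println(reportIsInvalid(1430, 1.0, 0, 0, 3, 563));
  }
}
```

With these numbers, 3 new warnings out of 1430 datasets stay below 1%, while 563 dropped warnings (about 39%) are enough on their own to flag the report.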

@davidgamez
Member

@briandonahue I was looking into the acceptance test reports, and it seems that some of the new warnings come from fields that contain full sentences.
Example:

File/field: stops.txt/stop_desc
Value: Book online or call ... for pickup.

@isabelle-dr, do you think we should trigger the mixed case warning rule in the case of stops.txt/stop_desc?

@github-actions
Contributor

github-actions bot commented Jun 2, 2023

❌ Invalid acceptance test.
New Errors: 56 out of 1428 datasets (~4%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Errors: 14 out of 1428 datasets (~1%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 133 out of 1428 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 92 out of 1428 datasets (~6%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1428 sources (~0 %) are corrupted.
Commit: f70a7e7
Download the full acceptance test report here (report will disappear after 90 days).

@isabelle-dr
Contributor

@briandonahue and @davidgamez I am surprised by the acceptance test results in this PR. It has new errors and dropped errors, which is surprising since the mixed case check is a warning.

Is this because it's out of date with master?

@isabelle-dr
Contributor

To @davidgamez's question

> do you think we should trigger the mixed case warning rule in the case of stops.txt/stop_desc?

I believe the answer is yes, but I am not entirely sure. @tzujenchanmbd, is stop_desc user-facing or is it mostly used between producers and consumers?

@github-actions
Contributor

github-actions bot commented Jun 2, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1428 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 133 out of 1428 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 38 out of 1428 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1428 sources (~0 %) are corrupted.
Commit: e15cde4
Download the full acceptance test report here (report will disappear after 90 days).

@briandonahue
Collaborator Author

> Is this because it's out of date with master?

@isabelle-dr I merged master and it's still having issues. I also apparently closed the PR earlier - no idea how I did that... re-opened.

Might need @davidgamez's help with the test errors...

@briandonahue briandonahue reopened this Jun 2, 2023
@github-actions
Contributor

github-actions bot commented Jun 2, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 133 out of 1430 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 38 out of 1430 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1430 sources (~0 %) are corrupted.
Commit: b5690c4
Download the full acceptance test report here (report will disappear after 90 days).

@davidgamez
Member

> To @davidgamez's question
>
> > do you think we should trigger the mixed case warning rule in the case of stops.txt/stop_desc?
>
> I believe the answer is yes, but I am not entirely sure. @tzujenchanmbd, is stop_desc user-facing or is it mostly used between producers and consumers?

We have sentences in the stops.txt/stop_desc field, for example: "Book online or call ... for pickup". This implementation expects every word in the sentence to be mixed case, which is why we have a lot of new warnings. We have two options:

  • Drop the rule for stops.txt/stop_desc.
  • Re-think how the rules behave for sentences.
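
The first option can be sketched roughly as follows. This is an illustration only, not how the validator is structured: `SentenceFieldSketch`, `EXCLUDED_FIELDS`, and both methods are hypothetical names, and the per-word rule is an assumed reading of "every word in the sentence" from the comment above. The GTFS file name is written as `stops.txt`.

```java
import java.util.Set;

final class SentenceFieldSketch {

  // Hypothetical exclusion list, keyed as "file/field".
  private static final Set<String> EXCLUDED_FIELDS = Set.of("stops.txt/stop_desc");

  /** Assumed per-word rule: every word that has cased letters must mix upper and lower case. */
  static boolean everyWordIsMixedCase(String value) {
    for (String word : value.trim().split("\\s+")) {
      boolean hasUpper = word.chars().anyMatch(Character::isUpperCase);
      boolean hasLower = word.chars().anyMatch(Character::isLowerCase);
      // Words without cased letters (such as "...") are not judged at all.
      if ((hasUpper || hasLower) && !(hasUpper && hasLower)) {
        return false;
      }
    }
    return true;
  }

  /** Excluded fields never warn; other fields warn when the per-word rule fails. */
  static boolean shouldWarn(String fileName, String fieldName, String value) {
    if (EXCLUDED_FIELDS.contains(fileName + "/" + fieldName)) {
      return false;
    }
    return !everyWordIsMixedCase(value);
  }

  public static void main(String[] args) {
    String sentence = "Book online or call ... for pickup.";
    System.out.println(shouldWarn("stops.txt", "stop_name", sentence));
    System.out.println(shouldWarn("stops.txt", "stop_desc", sentence));
  }
}
```

Under this rule an ordinary sentence fails as soon as it contains an all-lowercase word such as "online", which is why excluding a free-text field removes a whole class of warnings at once.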

@isabelle-dr
Contributor

After a conversation during the contributor meeting, we've decided to go with

> Drop the rule for stops.txt/stop_desc.

@isabelle-dr
Contributor

@briandonahue, would you be able to review the acceptance test report to evaluate if this PR removed most of the false positives, or if they still constitute a big portion?

@github-actions
Contributor

github-actions bot commented Jun 6, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 128 out of 1430 datasets (~9%) are invalid due to code change, which is above the provided threshold of 1%.
Dropped Warnings: 45 out of 1430 datasets (~3%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1430 sources (~0 %) are corrupted.
Commit: 802c8ed
Download the full acceptance test report here (report will disappear after 90 days).

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: ./gradlew goJF.

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: ./gradlew goJF.

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 3 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Warnings: 563 out of 1430 datasets (~39%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1430 sources (~0 %) are corrupted.
Commit: fa2bdd4
Download the full acceptance test report here (report will disappear after 90 days).

Contributor

@isabelle-dr isabelle-dr left a comment

When we designed the acceptance tests a while back, this is exactly what we had in mind for how it would be useful :)
I had a quick look at the acceptance test report and this change looks good to me: the number of false positives triggered by this notice dropped significantly with the latest changes.

Thank you for the good work @briandonahue and @davidgamez, I am glad we found a way to retain this notice in the validator!

@isabelle-dr
Contributor

The last step is a code review by one of the core developers! cc @davidgamez

@github-actions
Contributor

github-actions bot commented Jun 9, 2023

This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: ./gradlew goJF.

@github-actions
Contributor

github-actions bot commented Jun 9, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 3 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Warnings: 563 out of 1430 datasets (~39%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1430 sources (~0 %) are corrupted.
Commit: 34a9c44
Download the full acceptance test report here (report will disappear after 90 days).

@github-actions
Contributor

github-actions bot commented Jun 9, 2023

❌ Invalid acceptance test.
New Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
New Warnings: 3 out of 1430 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Warnings: 563 out of 1430 datasets (~39%) are invalid due to code change, which is above the provided threshold of 1%.
0 out of 1430 sources (~0 %) are corrupted.
Commit: 846c785
Download the full acceptance test report here (report will disappear after 90 days).

@davidgamez davidgamez merged commit eb54951 into master Jun 11, 2023
@davidgamez davidgamez deleted the issue/1402/mixed_case_numbers branch June 11, 2023 22:52
Development

Successfully merging this pull request may close these issues.

False positive mixed_case_recommended_field for numeric values