Collapse of more than three dots into "padding dots" symbol cannot be disabled #16345

Open
codeofdusk opened this issue Mar 30, 2024 · 17 comments

Comments

@codeofdusk
Contributor

Steps to reproduce:

  1. Set symbol level to "all".
  2. Read the following string with NVDA: ....

Actual behavior:

NVDA says "padding dots".

Expected behavior:

NVDA says "dot dot dot dot". This behaviour changed in #16141, and the string "padding dots" is very confusing without the PR context. This collapsing should be disabled at the "all" symbol level (since otherwise not all symbols are reported) or, at the very least, be user-configurable even if enabled by default.

NVDA logs, crash dumps and other attachments:

N/A

System configuration

N/A

Other questions

N/A

Context

See #16141 (comment)

@codeofdusk
Contributor Author

CC @CyrilleB79.

@CyrilleB79
Collaborator

NVDA says "dot dot dot dot".

Note that previously, NVDA said "4 dots" on my end, not "dot dot dot dot".

the string "padding dots" is very confusing without the PR context.

I agree that "padding" makes an assumption about the usage of these dots, which does not hold in all cases. If we rename the string to something more semantically neutral such as "multiple dots", would that be better for you? Or do you need to know the exact number of dots?
Also, is the case of 4 dots a specific one? If so, we may modify the regexp to capture at least 5 dots.

In #16141 (comment), you explain the original use case which has triggered the opening of this issue:

When reading any running paragraph of text containing more than three consecutive dots, at the end of Outlook notifications, etc.

Could you provide sample texts of these notifications, or other cases where you have such dot sequences? Usual punctuation uses one or three dots, so all other cases are something more specific that should be shown here as concrete examples, please. Thanks.
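
To make the threshold question above concrete, here is a minimal Python sketch comparing a "4 or more dots" pattern with a "5 or more dots" pattern. The pattern names, the `collapse_dots` helper, and the spoken label are hypothetical and are not taken from NVDA's symbol files or source code.

```python
import re

# Hypothetical patterns for illustration only; NVDA's real rule lives in
# its symbol files and may be written differently.
FOUR_OR_MORE_DOTS = re.compile(r"\.{4,}")
FIVE_OR_MORE_DOTS = re.compile(r"\.{5,}")


def collapse_dots(text, pattern, label="multiple dots"):
    """Replace every dot run matched by `pattern` with a spoken label."""
    return pattern.sub(f" {label} ", text)


# With the 4+ pattern a run of four dots is collapsed;
# with the 5+ pattern the same run is left untouched.
print(repr(collapse_dots("wait....", FOUR_OR_MORE_DOTS)))
print(repr(collapse_dots("wait....", FIVE_OR_MORE_DOTS)))
```

Raising the minimum from 4 to 5 only moves the boundary; any fixed threshold still hides the exact count for runs above it.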

@hwf1324
Contributor

hwf1324 commented Apr 2, 2024

To provide a special case: in Chinese, the ellipsis is usually written as "……", but sometimes it is also written directly with six dots, for example "......".

@Adriani90
Collaborator

Note that an ellipsis can be expressed in many ways, so Chinese should not get special treatment here. Either we go all the way and implement all ellipsis pronunciations in the symbols.dic files, or we let users decide from the reading context where an ellipsis applies.
https://en.wikipedia.org/wiki/Ellipsis

VoiceOver can indeed count the symbols if there are more than x symbols next to each other. I am quite sure there must be a Pythonic way to do this by traversing the string and counting the characters if they are the same, and it could be an optional setting in the general settings category. For example, if more than 2 dots, dashes, or underlines are next to each other, NVDA reports the number of these symbols instead of each one of them.
In this way we wouldn't honor any ellipsis pronunciation, and users would decide for themselves whether 3 dots or 6 dots should be understood as an ellipsis.
Note that in some books, for example, an ellipsis is also expressed via dashes.
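
The counting idea described above could be sketched in Python roughly as follows. This is only an illustration using `itertools.groupby` from the standard library; the function name, the `SPOKEN_NAMES` table, and the threshold are assumptions and do not reflect how NVDA actually processes speech.

```python
from itertools import groupby

# Hypothetical spoken names (plural form) for the symbols to be counted.
SPOKEN_NAMES = {".": "dots", "-": "dashes", "_": "underlines"}


def describe_runs(text, min_run=3):
    """Replace runs of at least `min_run` identical counted symbols
    with "<count> <name>"; leave everything else unchanged."""
    parts = []
    for char, group in groupby(text):
        count = len(list(group))  # length of this run of identical chars
        if char in SPOKEN_NAMES and count >= min_run:
            parts.append(f" {count} {SPOKEN_NAMES[char]} ")
        else:
            parts.append(char * count)
    return "".join(parts)


print(describe_runs("a....b"))
print(describe_runs("x---y___z"))
```

Because the count is reported rather than a fixed label, the listener can decide whether 3 or 6 dots should be read as an ellipsis.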

@Adriani90
Collaborator

@codeofdusk instead of reverting the padding symbols altogether, would it not be better to first find a Pythonic way of counting characters and only revert afterwards? I think it is still possible to hear all symbols if you change the symbol level, so the ellipsis problem does not need to be solved that urgently. Or am I wrong?
Anyway, it is really annoying to hear all those dots in PDF files.

@CyrilleB79
Collaborator

@codeofdusk, I can see that you have opened #16364.

I am not against reverting #16345 if it causes issues.

But in #16364, you are describing the Chinese use case as the reason to revert.
If only the 6-dot Chinese pattern were an issue, there is a solution, which is to add this 6-dot pattern as a complex symbol, as the 3-dot pattern already is.

But my understanding was that the "padding dots" modification was very annoying for you and that Chinese was not your main use case.
So could you please reply to #16345 (comment) and describe in more detail your use case and what is annoying for you?

@CyrilleB79
Collaborator

@codeofdusk have you seen #16345 (comment)? More specifically, can you answer the following questions?

If we rename the string to something more semantically neutral such as "multiple dots", would it be better for you? Or do you need to know the exact number of dots?

Thanks.

@Adriani90
Collaborator

Just to note, hash signs are already counted in NVDA if there are more than 3 signs consecutively, so maybe we can use that logic here as well.

@Adriani90
Collaborator

And also for dashes and underlines.

@CyrilleB79
Collaborator

@Adriani90 and all:
To clarify, in NVDA 2024.1, any symbol repeated more than 3 times is reported with its number of repetitions. Due to #16141, the number of dots is no longer reported; the group of dots is reported as "padding dots" instead.

It's clear that the expression "padding dots" is not suitable since it does not cover all use cases. I have proposed to replace it by "multiple dots". If only this change needs to be made, it is very easy to do it in the symbols.dic file.

But if you (@codeofdusk, @Adriani90) or anyone else want the number of dots to keep being reported, this cannot be handled by a modification in symbols.dic alone; some Python code needs to be written specifically, and I am a bit concerned about having to write Python code just to handle one specific symbol. Hence my question: is having the number of dots reported important for you (and all users) or not?
Thanks.

@CyrilleB79
Collaborator

@Adriani90 wrote:

And also for dashes and underlines.

By the way, aren't repeated dash and underline characters reported on your end? Provided your punctuation level is high enough, of course.

@lukaszgo1
Contributor

Given that we report the number of occurrences of each symbol other than dot, I'd say we should try to be consistent and do the same for dot. Are multiple dots in a row in tables of contents really common enough to justify breaking the rule? Can this case be solved in a different way?

@Adriani90
Collaborator

So I tested again with NVDA 2023.3.4, and it seems the number of dots was indeed reported as expected when the symbols were next to each other, contrary to what I thought, which was that every single dot was reported. Sorry for not having tested this in more detail.
However, I looked a bit further into the root problem.
We need dots to be reported at low symbol levels because they are part of number separators and ellipses, for example when indicating chapters or using an ellipsis in expressions.
But we don't need them reported at all at low symbol levels if there are more than 6 dots in a row, because there is no use case for more than 6 dots in a row in usual reading. The ellipsis issue could be solved by reporting a maximum of 6 dots at symbol levels "none" or "some", and completely suppressing the reporting of 7 or more dots at these symbol levels.
Practical comparison:

Using dashes with symbol level "some":
2--------2-1
Using underlines:
2________2_1
Using dots:
2........2.1
1. Introduction ................................ 21
2. Chapter 1 ................................ 4
3. Chapter 2 ................................ 10

Ideal behavior:

  1. NVDA counts 6 dots maximum, and completely suppresses reporting of more than 6 dots when the symbol level is "some" or "none".
  2. NVDA reports the number of dots like in 2023.3.4 when symbol level is set to most or all.

This would be more consistent with e.g. dashes and would still be good enough to solve #15845.
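
The proposed behavior can be sketched in Python as follows. This is a rough illustration under stated assumptions, not NVDA code: the `speak_dots` function, the level names as plain strings, and the choice to start counting at runs of 4 dots (following the "repeated more than 3 times" rule mentioned earlier in this thread) are all hypothetical.

```python
import re

# Assumption: levels are plain strings; "none" and "some" are the low levels.
LOW_LEVELS = {"none", "some"}


def speak_dots(text, level, min_run=4, max_counted=6):
    """Sketch of the proposal: count dot runs, but at low symbol levels
    drop runs longer than `max_counted` entirely."""
    dot_run = re.compile(r"\.{%d,}" % min_run)

    def replace(match):
        count = len(match.group())
        if level in LOW_LEVELS and count > max_counted:
            return " "  # long leader dots are silenced at low levels
        return f" {count} dots "

    return dot_run.sub(replace, text)


line = "2. Chapter 1 ................................ 4"
print(speak_dots(line, "some"))  # leader dots suppressed
print(speak_dots(line, "all"))   # leader dots counted
print(speak_dots("2......2.1", "some"))  # six dots are still counted
```

Shorter runs such as a single dot or a three-dot ellipsis are left untouched by this sketch, so existing symbol rules would still apply to them.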

@XLTechie
Collaborator

XLTechie commented May 7, 2024 via email

@Adriani90
Collaborator

For three or six dots, we should report ellipsis I think, at Most or all symbol level.

I disagree with this particular ellipsis reporting. An ellipsis can also be expressed via other signs, not only dots. Whether it is contextually an ellipsis or not should be left to the user's interpretation when reading. The number of dots is enough information to deduce the ellipsis.

When reviewing by word, the current 2024.1 behavior should be preserved--report number of dots.

Did you mean 2023.4? In 2024.1 "padding dots" is reported; this should not be preserved, not even for word review. The number of dots should be reported for word review as well.

@XLTechie
Collaborator

XLTechie commented May 9, 2024

I wrote:

When reviewing by word, the current 2024.1 behavior should be preserved--report number of dots.

@Adriani90 responded:

Did you mean 2023.4? In 2024.1 padding dots is reported, this should not be preserved, neither for word review.

No, I meant 2024.1. Padding dots is introduced in 2024.2 (so is unreleased) per the changelog, and also as easily tested with my copy of 2024.1RC1.

@CyrilleB79 CyrilleB79 added this to the 2024.2 milestone May 23, 2024
@CyrilleB79
Collaborator

Adding to the 2024.2 milestone since it impacts a new feature from this dev cycle.

NV Access (@seanbudd or others), feel free to remove from the milestone if you decide it's not important.

seanbudd pushed a commit that referenced this issue May 23, 2024
Related to #16345.

Summary of the issue:
Multiple dots are reported as "padding dots" in situations where these dots have no padding function. "padding" is too restrictive and is also more difficult to understand; by the way, some translators have actually translated "padding dots" to "multiple dots" in their translations.

Description of user facing changes
Multiple dots (4 or more) will now be reported with the more neutral "multiple dots" instead of "padding dots" when the symbol level is high enough.

Description of development approach
Changed both the symbol name and what is reported in the symbol file.
@seanbudd seanbudd removed this from the 2024.2 milestone May 24, 2024
8 participants