Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test cases for truncating strings of equal length to the specificed length #106

Conversation

jbcpollak
Copy link

This PR creates a few unit tests to check situations where the string may be exactly equal length with the trunctation target, as reported in charmbracelet/lipgloss#324

The naive "equalascii" test seems to be relatively straight forward to fixe, but the "equalcontrolemoji" is trickier. The best I can come up with is when the target string length is reached, to scan the rest of the input string to see if the remainder is all control characters or has more printable characters.

While potentially slower, this scan would only need to be done once, at the the point were the truncation would begin, and could be short circuited if any printable characters are found. The slowest case would be when nothing but control characters remain.

I added an equalemoji which gets truncated - emojis and chinese characters are treated as double width and therefor truncated, even though they only take up one character space. I'm not sure if that is intentional or not.

@jbcpollak jbcpollak requested a review from aymanbagabas as a code owner June 29, 2024 04:18
@mikelorant
Copy link

I've also attempted to fix the Truncate function and have had no luck. I think the approach is fundamentally flawed because the ability to rewind the state is necessary depending on how many printable characters are part of the tail.

I'm nearly certain, but without hearing from @aymanbagabas I cant be sure, that we will need to rewrite this function to take a new approach:

  1. Scan the string and identify the location (bytes) that the printable length increases and by how much (either 1 or 2 width depending if it is an emoji etc).
  2. From this we can determine if the entire string can be printed otherwise where do we truncate to make sure the tail also fits.

The current function is very difficult to follow and I assume @jbcpollak has hit the same challenges I have in trying to figure out the logic. I'd highly recommend we extend the comments with an explanation of the logic as well as make sure each branching point is clearly commented.

@mikelorant
Copy link

@jbcpollak I'd also recommend we add test cases for plus 1 and minus 1. This should also be included for tails with more than 1 printable character (which means we need test cases for plus 2 and minus 2 if the tail is ..). Consider this the type of code that needs a lot of edge case handling, otherwise refactors are likely to break a lot more than they fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants