-
-
Notifications
You must be signed in to change notification settings - Fork 761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge and simplify unescape #992
Conversation
I remember there was some issue with broken wide characters because of which we had to consider The third param in |
Ah I see. Yeah I hadn't considered that, my bad. Feel free to close as you see fit. |
Yes, let's drop it. |
@jarun Were you able to verify that this implementation doesn't work with any characters btw? I tested it with a wide glyph I have on my system and it seems to work fine: Seems like a good improvement to me to merge the two |
It breaks. And please give me a break. Run I really keep busy with different stuff and I need people to stop harping on the same string. |
OK. I understand you didn't know about the korean string we had for testing. |
I see, yeah I didn't know about the test string sorry. |
No problem! |
BTW I found an implementation that works for the Korean strings while still merging and simplifying the two Let me know if I should reopen the PR or if you have no interest in merging the two nevermind. |
I am thinking of the following detection logic: diff --git a/src/nnn.c b/src/nnn.c
index dde27b6..9ba8028 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -3417,7 +3417,7 @@ static void resetdircolor(int flags)
* Max supported str length: NAME_MAX;
*/
#ifndef NOLOCALE
-static wchar_t *unescape(const char *str, uint_t maxcols)
+static wchar_t *unescape_w(const char *str, uint_t maxcols)
{
wchar_t * const wbuf = (wchar_t *)g_buf;
wchar_t *buf = wbuf;
@@ -3457,19 +3457,23 @@ static wchar_t *unescape(const char *str, uint_t maxcols)
return wbuf;
}
-#else
+#endif
+
static char *unescape(const char *str, uint_t maxcols)
{
ssize_t len = (ssize_t)xstrsncpy(g_buf, str, maxcols);
--len;
- while (--len >= 0)
+ while (--len >= 0) {
+#ifndef NOLOCALE
+ if (g_buf[len] & 0x8)
+ return NULL;
+#endif
if (g_buf[len] <= '\x1f' || g_buf[len] == '\x7f')
g_buf[len] = '\?';
-
+ }
return g_buf;
}
-#endif
static off_t get_size(off_t size, off_t *pval, uint_t comp)
{
@@ -3754,7 +3758,10 @@ static void printent(const struct entry *ent, uint_t namecols, bool sel)
++namecols;
#ifndef NOLOCALE
- addwstr(unescape(ent->name, namecols));
+ if (unescape(ent->name, MIN(namecols, ent->nlen) + 1))
+ addstr(g_buf);
+ else
+ addwstr(unescape_w(ent->name, namecols));
#else
addstr(unescape(ent->name, MIN(namecols, ent->nlen) + 1));
#endif |
Hmm okay. luukvbaal@92e01d4 is my final implementation of how I would merge the two and I see no problems with it. Seems more elegant to me and Edit: the function for clarity #ifdef NOLOCALE
static char *unescape(const char *str, uint_t maxcols)
{
char * const wbuf = g_buf;
char *buf = wbuf;
xstrsncpy(wbuf, str, maxcols);
#else
static wchar_t *unescape(const char *str, uint_t maxcols)
{
wchar_t * const wbuf = (wchar_t *)g_buf;
wchar_t *buf = wbuf;
size_t len, lencount = maxcols + 1;
/* Convert multi-byte to wide char */
mbstowcs(wbuf, str, maxcols);
len = wcswidth(wbuf, --lencount);
if (len >= maxcols) {
/* Reduce wide chars one by one till it fits */
while (len > maxcols)
len = wcswidth(wbuf, --lencount);
wbuf[lencount] = L'\0';
}
#endif
while (*buf) {
if (*buf <= '\x1f' || *buf == '\x7f')
*buf = '\?';
++buf;
}
return wbuf;
} |
The detection (plus on-the-fly escaping) logic performs better on English-native systems but not on non-English-native systems. |
I think your final version is optimized enough given we consider |
Sure, made the PR #993. |
While considering to implement the suggestion in #935 (comment) for my fork and looking at
unescape()
I thought the while loop to reduce the number of wide chars to max columns seemed unnecessary.This merges
unescape()
for the default andNOLOCALE
case and adjusts the number wide chars tomaxcols
in the same call tombstowcs()
by providingmaxcols
instead ofNAME_MAX
as the character limit, avoiding the extra while loop.Also came across the issue mentioned in #989 while testing this. Without #989 applied
wbuf[maxcols] = '\0';
results in a segfault with icons enabled andmaxcols < 2
.