json_quote return different result #37294

Closed
Tracked by #36993
xiongjiwei opened this issue Aug 23, 2022 · 5 comments · Fixed by #53961
Labels: component/json, severity/minor, sig/sql-infra (SIG: SQL Infra), type/bug (The issue is confirmed as a bug.)

Comments

@xiongjiwei
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table t (c varchar(20) collate utf8mb4_bin) charset utf8mb4;
insert into t values ('\\');
insert into t values (X'0C');
insert into t values ('"');
insert into t values ('\a');
insert into t values ('\b');
insert into t values ('\t');
insert into t values ('\n');
insert into t values ('\r');
insert into t values (X'10');
select json_quote(group_concat(c order by c, hex(c))) from t;

2. What did you expect to see? (Required)

"\b,\t,\n,\f,\r,\u0010,\",\\,a"

3. What did you see instead (Required)

"\b,\t,\n,\f,\r,\x10,\",\\,a"

4. What is your TiDB version? (Required)

@xiongjiwei xiongjiwei added type/bug The issue is confirmed as a bug. component/json severity/moderate labels Aug 23, 2022
@windtalker
Contributor

Minimal reproduction:

create table t (c varchar(20) collate utf8mb4_bin) charset utf8mb4;
insert into t values (X'10');
--- in TiDB
mysql> select json_quote(c) from t;
+---------------+
| json_quote(c) |
+---------------+
| "\x10"        |
+---------------+
1 row in set (0.00 sec)
--- in MySQL 8.0.25
mysql> select json_quote(c) from t;
+---------------+
| json_quote(c) |
+---------------+
| "\u0010"      |
+---------------+

@xiongjiwei
Contributor Author

This only occurs for characters whose byte value is below 0x20. MySQL escapes these characters as \u00XX, whereas TiDB emits \xXX. I will change the severity to minor.
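
To make the rule concrete, here is a minimal Go sketch of the escaping that the MySQL output in this thread suggests: the short escapes \b, \t, \n, \f, \r, \" and \\ are kept, and every other character below 0x20 becomes \u00XX. The helper name jsonQuote is hypothetical and this is not TiDB's actual implementation; it also assumes valid UTF-8 input (invalid bytes would come out as U+FFFD when ranging over the string).

```go
package main

import (
	"fmt"
	"strings"
)

// jsonQuote is a hypothetical helper (not TiDB code) that wraps s in double
// quotes and escapes it the way the MySQL results in this issue look.
func jsonQuote(s string) string {
	var b strings.Builder
	b.WriteByte('"')
	for _, r := range s {
		switch r {
		case '"':
			b.WriteString(`\"`)
		case '\\':
			b.WriteString(`\\`)
		case '\b':
			b.WriteString(`\b`)
		case '\f':
			b.WriteString(`\f`)
		case '\n':
			b.WriteString(`\n`)
		case '\r':
			b.WriteString(`\r`)
		case '\t':
			b.WriteString(`\t`)
		default:
			if r < 0x20 {
				// Remaining control characters use the six-character
				// \u00XX form, never Go's \xXX or \a.
				fmt.Fprintf(&b, `\u%04x`, r)
			} else {
				b.WriteRune(r)
			}
		}
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(jsonQuote("\x10")) // the character from the minimal reproduction
	fmt.Println(jsonQuote("\a"))   // 0x07 has no short escape in JSON
	fmt.Println(jsonQuote("a\tb"))
}
```

With this sketch, 0x10 is rendered as "\u0010" and 0x07 as "\u0007", matching the MySQL side of the comparison rather than the "\x10" and "\a" that TiDB returned.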

@xiongjiwei xiongjiwei mentioned this issue Aug 30, 2022
34 tasks
@jebter jebter added the sig/sql-infra SIG: SQL Infra label Aug 1, 2023
@dveeden
Contributor

dveeden commented Jun 12, 2024

Another simple test case for this that doesn't rely on a table:

SELECT JSON_QUOTE(CONVERT(0x10 USING utf8mb4));

@dveeden
Contributor

dveeden commented Jun 12, 2024

Another test:

WITH RECURSIVE nr(n) AS (
    SELECT 1 n
   UNION ALL
    SELECT n+1 FROM nr WHERE n<25
)
SELECT JSON_QUOTE(CONVERT(UNHEX(n) USING utf8mb4)) FROM nr;
mysql-8.0.11-TiDB-v8.2.0-alpha-313-gf35bab8191-dirty> WITH RECURSIVE nr(n) AS (SELECT 1 n UNION ALL SELECT n+1 FROM nr WHERE n<25) SELECT JSON_QUOTE(CONVERT(UNHEX(n) USING utf8mb4)) FROM nr;
+---------------------------------------------+
| JSON_QUOTE(CONVERT(UNHEX(n) USING utf8mb4)) |
+---------------------------------------------+
| "\x01"                                      |
| "\x02"                                      |
| "\x03"                                      |
| "\x04"                                      |
| "\x05"                                      |
| "\x06"                                      |
| "\a"                                        |
| "\b"                                        |
| "\t"                                        |
| "\x10"                                      |
| "\x11"                                      |
| "\x12"                                      |
| "\x13"                                      |
| "\x14"                                      |
| "\x15"                                      |
| "\x16"                                      |
| "\x17"                                      |
| "\x18"                                      |
| "\x19"                                      |
| " "                                         |
| "!"                                         |
| "\""                                        |
| "#"                                         |
| "$"                                         |
| "%"                                         |
+---------------------------------------------+
25 rows in set (0.00 sec)
mysql-8.4.0> WITH RECURSIVE nr(n) AS (SELECT 1 n UNION ALL SELECT n+1 FROM nr WHERE n<25) SELECT JSON_QUOTE(CONVERT(UNHEX(n) USING utf8mb4)) FROM nr;
+---------------------------------------------+
| JSON_QUOTE(CONVERT(UNHEX(n) USING utf8mb4)) |
+---------------------------------------------+
| "\u0001"                                    |
| "\u0002"                                    |
| "\u0003"                                    |
| "\u0004"                                    |
| "\u0005"                                    |
| "\u0006"                                    |
| "\u0007"                                    |
| "\b"                                        |
| "\t"                                        |
| "\u0010"                                    |
| "\u0011"                                    |
| "\u0012"                                    |
| "\u0013"                                    |
| "\u0014"                                    |
| "\u0015"                                    |
| "\u0016"                                    |
| "\u0017"                                    |
| "\u0018"                                    |
| "\u0019"                                    |
| " "                                         |
| "!"                                         |
| "\""                                        |
| "#"                                         |
| "$"                                         |
| "%"                                         |
+---------------------------------------------+
25 rows in set (0.00 sec)
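
One thing to keep in mind when reading the two result sets: the rows jump from 0x09 to 0x10 and from 0x19 to 0x20, so 0x0A through 0x0F and 0x1A through 0x1F are not covered. That is consistent with UNHEX(n) reading the decimal digits of n as hex digits. The Go sketch below mimics that behaviour under this assumption; unhexOfInt is a hypothetical name, and the left-padding of odd-length input is inferred from the output above rather than from the server source.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strconv"
)

// unhexOfInt is a hypothetical stand-in for UNHEX(n) with an integer n:
// it formats n in decimal and then decodes those digits as hexadecimal.
func unhexOfInt(n int) ([]byte, error) {
	digits := strconv.Itoa(n)
	if len(digits)%2 == 1 {
		// Assumption: an odd number of digits is padded on the left.
		digits = "0" + digits
	}
	return hex.DecodeString(digits)
}

func main() {
	for _, n := range []int{9, 10, 19, 20, 25} {
		b, err := unhexOfInt(n)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("n=%d -> 0x%X\n", n, b)
	}
}
```

Under that reading, n=10 maps to byte 0x10 and n=20 to 0x20 (a space), which lines up with the tenth and twentieth rows of both result sets.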

@dveeden
Contributor

dveeden commented Jun 12, 2024

Are we using strconv.Quote() instead of relying on encoding/json?

package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

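func main() {
	// Compare Go-syntax quoting with JSON encoding for code points 1..37.
	for i := 1; i < 38; i++ {
		// Convert through rune: string(int) is flagged by go vet.
		s := string(rune(i))
		j, _ := json.Marshal(s)
		fmt.Printf(
			"i=%d\tquote=%s\tj=%s\n",
			i,
			strconv.Quote(s),
			j,
		)
	}
}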
i=1	quote="\x01"	j="\u0001"
i=2	quote="\x02"	j="\u0002"
i=3	quote="\x03"	j="\u0003"
i=4	quote="\x04"	j="\u0004"
i=5	quote="\x05"	j="\u0005"
i=6	quote="\x06"	j="\u0006"
i=7	quote="\a"	j="\u0007"
i=8	quote="\b"	j="\b"
i=9	quote="\t"	j="\t"
i=10	quote="\n"	j="\n"
i=11	quote="\v"	j="\u000b"
i=12	quote="\f"	j="\f"
i=13	quote="\r"	j="\r"
i=14	quote="\x0e"	j="\u000e"
i=15	quote="\x0f"	j="\u000f"
i=16	quote="\x10"	j="\u0010"
i=17	quote="\x11"	j="\u0011"
i=18	quote="\x12"	j="\u0012"
i=19	quote="\x13"	j="\u0013"
i=20	quote="\x14"	j="\u0014"
i=21	quote="\x15"	j="\u0015"
i=22	quote="\x16"	j="\u0016"
i=23	quote="\x17"	j="\u0017"
i=24	quote="\x18"	j="\u0018"
i=25	quote="\x19"	j="\u0019"
i=26	quote="\x1a"	j="\u001a"
i=27	quote="\x1b"	j="\u001b"
i=28	quote="\x1c"	j="\u001c"
i=29	quote="\x1d"	j="\u001d"
i=30	quote="\x1e"	j="\u001e"
i=31	quote="\x1f"	j="\u001f"
i=32	quote=" "	j=" "
i=33	quote="!"	j="!"
i=34	quote="\""	j="\""
i=35	quote="#"	j="#"
i=36	quote="$"	j="$"
i=37	quote="%"	j="%"

https://go.dev/play/p/wwO4KFNnOIK
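
If the quoting were delegated to encoding/json, it might look roughly like the sketch below. This is only an illustration of the idea raised in this comment, not the change that closed the issue: quoteWithEncodingJSON is a hypothetical name, and disabling HTML escaping is an assumption about what output is wanted (by default encoding/json also rewrites <, > and & as \u003c, \u003e and \u0026).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// quoteWithEncodingJSON is a hypothetical helper that produces a JSON string
// literal for s by using the standard library encoder.
func quoteWithEncodingJSON(s string) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	// Assumption: HTML-sensitive characters should stay literal.
	enc.SetEscapeHTML(false)
	if err := enc.Encode(s); err != nil {
		return "", err
	}
	// Encode appends a newline after each value; drop it.
	return strings.TrimSuffix(buf.String(), "\n"), nil
}

func main() {
	for _, s := range []string{"\x10", "\a", "<tag>"} {
		q, err := quoteWithEncodingJSON(s)
		if err != nil {
			fmt.Println("encode error:", err)
			continue
		}
		fmt.Println(q)
	}
}
```

Two caveats with this approach: encoding/json replaces invalid UTF-8 with U+FFFD and escapes U+2028 and U+2029, and the exact escapes for \b and \f depend on the Go version, so the result would still need to be compared against MySQL before relying on it.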

4 participants