Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json_extract_path and type/2 #3273

Closed
Hermanverschooten opened this issue Apr 8, 2020 · 20 comments
Closed

json_extract_path and type/2 #3273

Hermanverschooten opened this issue Apr 8, 2020 · 20 comments

Comments

@Hermanverschooten
Copy link

Hermanverschooten commented Apr 8, 2020

Environment

  • Elixir version (elixir -v): 1.10.2 (compile with Erlang/OTP 22)
  • Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): Postgresql 12
  • Ecto version (mix deps): 3.4.0
  • Database adapter and version (mix deps): postgrex 0.15.4
  • Operating system: OSX 10.15.3

Trying to use either form of json_extract_path with type/2 does not work as expected.

Current behavior

Repo.one(from t in Garage.Truck, where: t.id == 15, select: type(json_extract_path(t.extra, ["tacho"]), :string))
** (Ecto.Query.CompileError) the first argument of type/2 must be one of:

  * interpolations, such as ^value
  * fields, such as p.foo or field(p)
  * fragments, such fragment("foo(?)", value)
  * an arithmetic expression (+, -, *, /)
  * an aggregation or window expression (avg, count, min, max, sum, over, filter)

Got: json_extract_path(t.extra, ["tacho"])

    (ecto 3.4.0) expanding macro: Ecto.Query.select/3
    iex:6: (file)
    (ecto 3.4.0) expanding macro: Ecto.Query.from/2
    iex:6: (file)

and with the short form:

Repo.one(from t in Garage.Truck, where: t.id == 15, select: type(t.extra["tacho"], :string))
[debug] QUERY OK source="trucks" db=0.5ms queue=0.3ms idle=9434.3ms
SELECT (t0."extra"#>'{"tacho"}')::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15) []
"\"2020-11-23\""

Expected behavior

type/2 should either allow both forms or neither.

It is also kind of weird that if you omit the type/2 it returns a plain "2020-11-23", but with there are quotes within the quotes "\"2020-11-23\"", which means that if you use a custom ecto type it will receive the quoted string.

@wojtekmach
Copy link
Member

Thanks for the report, I'll fix it later today.

@josevalim
Copy link
Member

but with there are quotes within the quotes ""2020-11-23"", which means that if you use a custom ecto type it will receive the quoted string.

This is database behaviour though. When you say type(..., :string), you are actually telling the database to encode that json to string. Since the json is a string, you get a double quoted string. I think a lot of times the combination of type+json is not going to be what people expect, so I wonder if we should forbid mixing them altogether.

@Hermanverschooten
Copy link
Author

I understand that if I give :string it would stringify it, but If I give it to my custom decoder it too will receive the quoted string.

@wojtekmach
Copy link
Member

so I wonder if we should forbid mixing them altogether.

Yeah I think that's a good call, I believe to even get a decimal on PG we needed to p.meta["decimal"] |> type(:string) |> type(:decimal) so that is already not obvious and ugly.

@josevalim
Copy link
Member

josevalim commented Apr 8, 2020

@Hermanverschooten that's my point. You are stringifying a json type. Stringifying a json type is the equivalent to calling JSON.encode! on it, which means you are encoding the string to json. This is not Ecto semantics, this is all Postgres semantics.

@Hermanverschooten
Copy link
Author

Hermanverschooten commented Apr 8, 2020

@josevalim I know, but I only took this as an example, my issue was that I have a custom type that will take either a nil, an empty string or a string with a date and return a Date. If use this with a fragment it works, but it looks ugly, if I use it with the t.extra["tacho"], I would expect to receive on of those three values, not "", "\"\"" or "\"2020-11-23\"". Especially because If I do not use type/2 I do get one of three values.

@josevalim
Copy link
Member

What is the fragment variant that works? Which query does it emit?

@Hermanverschooten
Copy link
Author

Repo.one(from t in Garage.Truck, where: t.id == 15, select: type(fragment("?->'tacho'", t.extra), :string))
[debug] QUERY OK source="trucks" db=0.4ms idle=9853.8ms
SELECT t0."extra"->'tacho'::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15) []
"2020-11-23"

@Hermanverschooten
Copy link
Author

There is definitely something strange going on, if I do the queries in psql:

SELECT (t0."extra"#>'{"tacho"}')::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15);
   varchar
--------------
 "2020-11-23"
(1 row)

SELECT (t0."extra"->'tacho')::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15);
   varchar
--------------
 "2020-11-23"
(1 row)

They both yield the same result. But ecto interprets them differently.

@wojtekmach
Copy link
Member

I believe it's about the database after all:

iex(4)> Postgrex.query!(pid, ~s| select json_extract_path('{"a":"hello"}', 'a'); |, []).rows
[["hello"]]

iex(5)> Postgrex.query!(pid, ~s| select json_extract_path('{"a":"hello"}', 'a')::varchar; |, []).rows
[["\"hello\""]]

@Hermanverschooten
Copy link
Author

@wojtekmach Have you seen my queries above? They both return the same quoted string.

@wojtekmach
Copy link
Member

wojtekmach commented Apr 8, 2020

Ok, I tried reproducing them:

iex(5)> Postgrex.query!(pid, ~s|select '{"a":"hello"}'::json->'a'|, []).rows
[["hello"]]
iex(6)> Postgrex.query!(pid, ~s|select '{"a":"hello"}'::json->'a'::varchar|, []).rows
[["hello"]]

So it seems there's a difference in behaviour between -> and json_extract_path, we use the latter in Ecto.

@Hermanverschooten
Copy link
Author

There is a difference with () surrounding the fragment.

Repo.one(from t in Garage.Truck, where: t.id == 15, select: type(fragment("?->'tacho'", t.extra), TranMan.Ecto.MaybeDate))
[debug] QUERY OK source="trucks" db=1.4ms idle=9052.4ms
SELECT t0."extra"->'tacho'::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15) []
~D[2020-11-23]

Repo.one(from t in Garage.Truck, where: t.id == 15, select: type(fragment("(?->'tacho')", t.extra), TranMan.Ecto.MaybeDate))
[debug] QUERY OK source="trucks" db=0.6ms queue=0.5ms idle=9405.8ms
SELECT (t0."extra"->'tacho')::varchar FROM "trucks" AS t0 WHERE (t0."id" = 15) []
** (FunctionClauseError) no function clause matching in Ecto.Type.process_loaders/3

    The following arguments were given to Ecto.Type.process_loaders/3:

        # 1
        []

        # 2
        {:error, :invalid_format, "\"2020-11-23\""}

        # 3
        Ecto.Adapters.Postgres

    Attempted function clauses (showing 4 out of 4):

        defp process_loaders(_, :error, _adapter)
        defp process_loaders([fun | t], {:ok, value}, adapter) when is_function(fun)
        defp process_loaders([type | t], {:ok, value}, adapter)
        defp process_loaders([], {:ok, _} = acc, _adapter)

    (ecto 3.4.0) lib/ecto/type.ex:920: Ecto.Type.process_loaders/3
    (ecto 3.4.0) lib/ecto/repo/queryable.ex:378: Ecto.Repo.Queryable.process/4
    (ecto 3.4.0) lib/ecto/repo/queryable.ex:250: anonymous fn/3 in Ecto.Repo.Queryable.postprocessor/4
    (elixir 1.10.2) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
    (ecto 3.4.0) lib/ecto/repo/queryable.ex:198: Ecto.Repo.Queryable.execute/4
    (ecto 3.4.0) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
    (ecto 3.4.0) lib/ecto/repo/queryable.ex:112: Ecto.Repo.Queryable.one/3

@Hermanverschooten
Copy link
Author

It seems to be in postgrex:

Postgrex.query!(pid, ~s|select extra->'tacho'::varchar, (extra->'tacho')::varchar from trucks where id=15|, [])
%Postgrex.Result{
  columns: ["?column?", "varchar"],
  command: :select,
  connection_id: 3082,
  messages: [],
  num_rows: 1,
  rows: [["2020-11-23", "\"2020-11-23\""]]

@josevalim
Copy link
Member

It is not a postgrex issue, it is still PostgreSQL semantics. extra->'tacho'::varchar is the same as extra->('tacho'::varchar). So you are casting the input and not the output. If you cast the output, the json encoding will happen.

@Hermanverschooten
Copy link
Author

The thing is I am not, type/2 does this, maybe it should surround the fragment with (). Because what it is doing now is clearly wrong.

@josevalim
Copy link
Member

josevalim commented Apr 8, 2020

Type should not interfere with fragments. When you use fragments, it is your responsibility to ensure proper precedence and everything. If we automatically wrap it in parens, we can actually break code and make some constructs impossible, which defeats the purpose of fragments.

In any way, the point is:

  1. If you do not wrap the fragment in parens, then the casting is not doing anything
  2. If you wrap it in parens, then the casting will give the wrong result

There is no situation where the casting here can be helpful and that's why I think we should not allow type to be used with json access or json_extract_path. The problem you are trying to solve cannot be addressed with casting.

@Hermanverschooten
Copy link
Author

I do agree if you would allow the result to be passed to my custom ecto type, because that is what I am aiming for, not the casting at the database level, I would prefer it not to. If type/2 would just leave the fragment alone, then the postprocessing can still happen.

@Hermanverschooten
Copy link
Author

Or should we have another function to do that?

@josevalim
Copy link
Member

Or should we have another function to do that?

Exactly. type is about casting at the database level, which is why you can even do things like type(..., :varchar), and varchar is not an Ecto type at all.

The functionality that you need is not in Ecto at the moment. The simplest solution for now is to do a separate pass on the selected data and converting it then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants