support user-defined mapping for Inf and NaN via keyword arg #294

Open
wants to merge 8 commits into
base: main
from
27 changes: 22 additions & 5 deletions src/read.jl
@@ -93,13 +93,13 @@ end

const FLOAT_INT_BOUND = 2.0^53

function read!(buf, pos, len, b, tape, tapeidx, ::Type{Any}, checkint=true; allow_inf::Bool=false)
function read!(buf, pos, len, b, tape, tapeidx, ::Type{Any}, checkint=true; inf_mapping::Union{Function,Nothing}=nothing, allow_inf::Bool=(inf_mapping !== nothing))
if b == UInt8('{')
return read!(buf, pos, len, b, tape, tapeidx, Object, checkint; allow_inf=allow_inf)
return read!(buf, pos, len, b, tape, tapeidx, Object, checkint; allow_inf=allow_inf, inf_mapping=inf_mapping)
elseif b == UInt8('[')
return read!(buf, pos, len, b, tape, tapeidx, Array, checkint; allow_inf=allow_inf)
return read!(buf, pos, len, b, tape, tapeidx, Array, checkint; allow_inf=allow_inf, inf_mapping=inf_mapping)
elseif b == UInt8('"')
return read!(buf, pos, len, b, tape, tapeidx, String)
return read!(buf, pos, len, b, tape, tapeidx, String; inf_mapping=inf_mapping)
elseif b == UInt8('n')
return read!(buf, pos, len, b, tape, tapeidx, Nothing)
elseif b == UInt8('t')
@@ -148,7 +148,7 @@ function read!(buf, pos, len, b, tape, tapeidx, ::Type{Any}, checkint=true; allo
invalid(InvalidChar, buf, pos, Any)
end

function read!(buf, pos, len, b, tape, tapeidx, ::Type{String})
function read!(buf, pos, len, b, tape, tapeidx, ::Type{String}; inf_mapping::Union{Function,Nothing}=nothing)
pos += 1
@eof
strpos = pos
@@ -171,6 +171,23 @@ function read!(buf, pos, len, b, tape, tapeidx, ::Type{String})
b = getbyte(buf, pos)
end
@check
if inf_mapping !== nothing
val = view(buf, strpos:pos-1)
float = if val == codeunits(inf_mapping(Inf))[2:end-1]
Inf
elseif val == codeunits(inf_mapping(-Inf))[2:end-1]
-Inf
elseif val == codeunits(inf_mapping(NaN))[2:end-1]
Contributor

This usage makes me think that inf_mapping should be a Tuple or NamedTuple rather than a function.

Author

I thought so, too. But the function version was much faster.

Author

@hhaensel hhaensel Jan 31, 2025

I tried arrays, tuples, and functions, at least for writing. I didn't check read performance.

Author

@hhaensel hhaensel Jan 31, 2025

I also looked at the RawType approach, but I couldn't work out how to change the type to Float. The current approach looks more natural to me and needs less code.
Supporting only string mappings is somewhat of a restriction, but I think it is quite unusual for people to want to cover values other than Infinity and NaN when their process allows sending non-standard JSON.
EDIT: it's easy to include the quotes by expanding the view and dropping the [2:end-1], so my previous comment is no longer valid.
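
The two comparisons mentioned here can be illustrated with a small standalone sketch. It does not call into JSON3; `buf`, `strpos`, and `pos` only mimic the parser state inside `read!`, and their values are assumptions chosen for this example.

```julia
# Same mapping function as in the PR's test file.
quoted_inf_mapping(x) = x == Inf ? "\"Infinity\"" : x == -Inf ? "\"-Infinity\"" : "\"NaN\""

# Stand-in for the parser buffer: the JSON text ["-Infinity"].
buf = codeunits("[\"-Infinity\"]")
strpos = 3    # first byte after the opening quote (assumed for this input)
pos = 12      # index of the closing quote (assumed for this input)

# Variant in the diff: compare the string contents against the mapping's
# output with its surrounding quotes trimmed off.
contents = view(buf, strpos:pos-1)
matches_trimmed = contents == codeunits(quoted_inf_mapping(-Inf))[2:end-1]

# Variant from the EDIT: widen the view to include both quotes and compare
# against the mapping's output unchanged.
quoted = view(buf, strpos-1:pos)
matches_quoted = quoted == codeunits(quoted_inf_mapping(-Inf))
```

Both comparisons succeed for this input. The widened view avoids building a trimmed copy of the mapping's output on every string that is read.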

Contributor

The performance difference between the function and the tuple comes from specialization: methods are specialized on a function's type, but not on a tuple's value. You could get similar performance with a tuple by lifting it to the type domain with Val. I do see the runtime-performance advantage of passing the serialization format of Inf and NaN into write/read at the type level.
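
A minimal sketch of the `Val` idea, kept outside JSON3: the name `encode_nonfinite` and the tuple layout are hypothetical and exist only for illustration. Strings are not valid type parameters, so the sketch lifts a tuple of Symbols instead.

```julia
# Hypothetical helper, not part of JSON3. The mapping is a tuple of Symbols
# (positive infinity, negative infinity, NaN) carried in the type via Val,
# so the compiler can specialize on its value.
function encode_nonfinite(x::AbstractFloat, ::Val{mapping}) where {mapping}
    pos_inf, neg_inf, nan = mapping
    isnan(x) && return String(nan)
    x == Inf && return String(pos_inf)
    x == -Inf && return String(neg_inf)
    throw(ArgumentError("expected a non-finite value, got $x"))
end

# Function-based variant for comparison: Julia specializes on the type of
# `f`, so no Val wrapper is needed.
encode_nonfinite(x::AbstractFloat, f::Function) = f(x)

default_mapping = Val((:Infinity, Symbol("-Infinity"), :NaN))

encode_nonfinite(Inf, default_mapping)
encode_nonfinite(NaN, x -> "null")
```

Whether this matches the function version's speed inside JSON3 would need benchmarking; the sketch only shows how the mapping reaches the type level.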

NaN
else
0.0
end
if float != 0.0
@inbounds tape[tapeidx] = FLOAT
@inbounds tape[tapeidx+1] = Core.bitcast(UInt64, float)
return pos + 1, tapeidx + 2
end
end
@inbounds tape[tapeidx] = string(strlen)
@inbounds tape[tapeidx+1] = ifelse(escaped, ESCAPE_BIT | strpos, strpos)
return pos + 1, tapeidx + 2
30 changes: 21 additions & 9 deletions src/write.jl
@@ -25,6 +25,11 @@ Write JSON.
## Keyword Args

* `allow_inf`: Allow writing of `Inf` and `NaN` values (not part of the JSON standard). [default `false`]
* `inf_mapping`: A function to map `Inf`, `-Inf` and `NaN` values to a custom representation. [default `nothing`]

if `inf_mapping` is `nothing` the mapping is equivalent to
`inf_mapping = x -> x == Inf ? "Infinity" : x == -Inf ? "-Infinity" : "NaN"`.
Specifying `inf_mapping` will automatically set the default value of `allow_inf` to `true`.
* `dateformat`: A [`DateFormat`](https://docs.julialang.org/en/v1/stdlib/Dates/#Dates.DateFormat) describing how to format `Date`s in the object. [default `Dates.default_format(T)`]
"""
function write(io::IO, obj::T; kw...) where {T}
@@ -279,19 +284,26 @@ function write(::NumberType, buf, pos, len, x::AbstractFloat; allow_inf::Bool=fa
return buf, pos, len
end

@inline function write(::NumberType, buf, pos, len, x::T; allow_inf::Bool=false, kw...) where {T <: Base.IEEEFloat}
isfinite(x) || allow_inf || error("$x not allowed to be written in JSON spec")
if isinf(x)
@inline function write(::NumberType, buf, pos, len, x::T; inf_mapping::Union{Function, Nothing} = nothing, allow_inf::Bool = inf_mapping !== nothing, kw...) where {T <: Base.IEEEFloat}
if isfinite(x) || (allow_inf && inf_mapping === nothing && isnan(x))
@check Ryu.neededdigits(T)
pos = Ryu.writeshortest(buf, pos, x)
else
allow_inf || error("$x not allowed to be written in JSON spec")
# Although this is non-standard JSON, "Infinity" is commonly used.
# See https://docs.python.org/3/library/json.html#infinite-and-nan-number-values.
if sign(x) == -1
@writechar '-'
if inf_mapping === nothing
sign(x) == -1 && @writechar '-'
@writechar 'I' 'n' 'f' 'i' 'n' 'i' 't' 'y'
else
bytes = codeunits(inf_mapping(x))
@check length(bytes)
for b in bytes
@inbounds buf[pos] = b
pos += 1
end
end
@writechar 'I' 'n' 'f' 'i' 'n' 'i' 't' 'y'
return buf, pos, len
end
@check Ryu.neededdigits(T)
pos = Ryu.writeshortest(buf, pos, x)
return buf, pos, len
end

5 changes: 5 additions & 0 deletions test/json.jl
@@ -46,6 +46,11 @@ end
@test JSON3.read("Inf"; allow_inf=true) === Inf
@test JSON3.read("Infinity"; allow_inf=true) === Inf
@test JSON3.read("-Infinity"; allow_inf=true) === -Inf
Contributor

Noting that it is not possible to support reading with a custom mapping defined with a function that maps Float64 to String.

However, it is possible to support reading and writing with a @NamedTuple{positive_inf::AbstractString, negative_inf::AbstractString, nan::AbstractString} API.
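
A rough sketch of what such an API could look like, independent of JSON3. The helper names `nonfinite_to_json` and `json_to_nonfinite` are hypothetical; only the NamedTuple field names come from the suggestion above.

```julia
# Hypothetical sketch, not JSON3 API: one NamedTuple holds the three literal
# representations, so the same value drives both writing and reading.

# Writing: pick the representation for a non-finite float.
function nonfinite_to_json(x::AbstractFloat, m::NamedTuple)
    isnan(x) && return m.nan
    isinf(x) || throw(ArgumentError("expected a non-finite value, got $x"))
    return x > 0 ? m.positive_inf : m.negative_inf
end

# Reading: compare a raw token against each representation.
# Returns `nothing` when the token is not one of the mapped literals.
function json_to_nonfinite(token::AbstractString, m::NamedTuple)
    token == m.positive_inf && return Inf
    token == m.negative_inf && return -Inf
    token == m.nan && return NaN
    return nothing
end

mapping = (positive_inf = "\"Infinity\"", negative_inf = "\"-Infinity\"", nan = "\"NaN\"")

# Round trip: every non-finite value survives write followed by read.
for x in (Inf, -Inf, NaN)
    token = nonfinite_to_json(x, mapping)
    @assert isequal(json_to_nonfinite(token, mapping), x)
end
```

Because the representations are plain data, the reader can compare against them directly instead of evaluating a function on `Inf`, `-Inf`, and `NaN` for every string it encounters.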

Owner

Yeah, I agree that not being able to read the JSON back in when a custom mapping is used is a bummer, but also it's something that is/will be better solved in JSONBase.jl, where it's easier to override reading things.

Actually, we do have the RawJson construct if you really needed to parse something back in; it's a bit of a heavy-handed escape hatch here, but technically would work.


quoted_inf_mapping(x) = x == Inf ? "\"Infinity\"" : x == -Inf ? "\"-Infinity\"" : "\"NaN\""
@test JSON3.write(NaN, inf_mapping = quoted_inf_mapping) == "\"NaN\""
@test JSON3.write(Inf, inf_mapping = quoted_inf_mapping) == "\"Infinity\""
@test JSON3.write(-Inf, inf_mapping = quoted_inf_mapping) == "\"-Infinity\""
end

@testset "Char" begin