Fields containing only numbers should be JSON integers #3
Comments
I expect that serializing ID-like field values as JSON numbers will cause more trouble than it cures, at least where the values might exceed 9007199254740991 (2^53 - 1), or say 15+ decimal digits, at some future time or in some large installations. Some host languages (e.g. JavaScript) do not handle such large numbers without further ado, so precision loss, staged parsing, or even interoperability / round-trip problems MAY occur.

On the other hand, a generally robust structure that uses only JSON objects, arrays, strings and booleans has some value in itself, hasn't it :-? Serializing a potentially unbounded identification number as a string has the benefit that, whatever consumer or producer processing is involved, there is a good chance the ID is preserved across write and read (whatever optimizations happen inside the local "processing" node).

Performance MAY be an issue, which would suggest leaning towards numbers where possible, but the structures are built from many nested objects and only a few arrays, so speed-optimized segmented parsing does not seem to be the primary design goal ;-)

Note that a migration from string (epoch) timestamps towards ISO 8601 or RFC 3339 (a subset of ISO 8601) would also move away from JSON numbers (#4).

So maybe a detailed discussion of these "value type serialization" topics would be helpful. I would consider one exchange at the general modeling level (primary and secondary goals), and then, possibly guided by that outcome, further discussions focused on groups of attributes.
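The precision concern above can be illustrated with a minimal Python sketch. It assumes that a consumer stores JSON numbers as IEEE 754 doubles (as JavaScript's `Number` does) and simulates that by converting through `float`; the helper name `as_double` and the sample ID are illustrative only and do not come from the MISP format.

```python
import json

# Largest integer a double can represent exactly together with all smaller ones.
MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991


def as_double(n: int) -> int:
    """Round-trip an integer through an IEEE 754 double (hypothetical helper)."""
    return int(float(n))


# An illustrative ID just above the safe range.
big_id = MAX_SAFE_INTEGER + 2  # 9007199254740993

# Stored as a double, the value silently changes.
lossy = as_double(big_id)

# Serialized as a JSON string, the digits survive the round trip unchanged.
string_round_trip = json.loads(json.dumps({"id": str(big_id)}))["id"]

print(big_id, lossy, string_round_trip)
```

Values up to `MAX_SAFE_INTEGER` survive the double round trip, which is why the problem only appears for very large IDs.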
At this date it seems that MISP software by default (that is, when backed by MySQL) uses 4 bytes for IDs (i.e. half the size of the limit noted above). Postgresql installs use Looking at these defaults, it seems quite unlikely that you would need anything larger than 8-byte ints in the foreseeable future.

Using numbers instead of strings means you can define a more meaningful JSON schema, which is helpful for validation, and it simplifies the serialization/deserialization process, where you otherwise have to translate these numbers back and forth. I'm not concerned here about speed, but about the hassle of implementing this.

This is only somewhat related, but if it had been described as a number in the JSON schema, I would never have been confused by #9 when implementing the MISP format in my own app.
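To put the sizes in this comment side by side, here is a small sketch of the arithmetic. It only compares generic 4-byte and 8-byte integer ranges with the exact-integer range of a double; it makes no claim about the column types a particular MISP installation actually uses, and `fits_in_double` is a hypothetical helper name.

```python
# Upper bounds of common fixed-width integer types.
INT32_MAX = 2**31 - 1    # signed 4-byte integer
UINT32_MAX = 2**32 - 1   # unsigned 4-byte integer
INT64_MAX = 2**63 - 1    # signed 8-byte integer

# Largest integer that an IEEE 754 double represents exactly without gaps below it.
MAX_SAFE_INTEGER = 2**53 - 1


def fits_in_double(n: int) -> bool:
    """True if n lies in the range a double can hold exactly (hypothetical helper)."""
    return abs(n) <= MAX_SAFE_INTEGER


for name, bound in [("int32", INT32_MAX), ("uint32", UINT32_MAX), ("int64", INT64_MAX)]:
    print(f"{name}: max={bound} digits={len(str(bound))} exact_in_double={fits_in_double(bound)}")
```

Under these assumptions, any 4-byte ID fits comfortably in a JSON number even for double-based parsers, while the full 8-byte range does not.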
To allow proper marshaling of the proposed misp-core-format JSON protocol, it would be essential for attributes that contain only numbers to be actual JSON integers:
For example, for Events:
This is probably also true for the various timestamps, which currently contain Unix timestamps formatted as strings.
The same should be applied throughout the other parts of the RFC.
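As a rough illustration of the translation step a consumer needs today, the following Python sketch converts selected digit-only string fields into integers after parsing. The field names in `INTEGER_FIELDS` and the sample event are assumptions made for this example, not a reproduction of the RFC's definitions, and `coerce_integer_fields` is a hypothetical helper.

```python
import json

# Hypothetical set of fields expected to hold integers; not taken from the RFC.
INTEGER_FIELDS = {"id", "org_id", "timestamp"}


def coerce_integer_fields(value):
    """Recursively convert digit-only strings in known fields to int."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            is_digits = isinstance(item, str) and item.isascii() and item.isdigit()
            if key in INTEGER_FIELDS and is_digits:
                out[key] = int(item)
            else:
                out[key] = coerce_integer_fields(item)
        return out
    if isinstance(value, list):
        return [coerce_integer_fields(item) for item in value]
    return value


# Made-up sample in which numeric values arrive as strings.
raw = '{"Event": {"id": "42", "org_id": "1", "timestamp": "1480000000", "info": "2016"}}'
event = coerce_integer_fields(json.loads(raw))

print(json.dumps(event))
```

Restricting the conversion to a known set of fields keeps free-text values such as `info` as strings even when they happen to contain only digits. If the format emitted JSON integers directly, this step would not be needed.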
REF: #2