Failure on attempt to serialize a pre-epoch timestamp received from an ext3 filesystem #464
Comments
...and I found another one:
(Yes, my folder structure for stuff bound for archival DVDs is an accreted mess that needs to be refactored.) UPDATE: It appears that this example only fails because of integer overflow on |
The relevant portion of the backtrace:
The code in question:
The issue is, there's no official way to get the underlying value out of This should probably be filed as an issue against the standard library; there needs to be some way to extract the underlying value of |
It should still be possible to work around it. I'll try to verify it later today, but, as I remember, the Specifically, the That's how an mtime of Serializing (In other words, the output is semantically equivalent to a That said, definitely a footgun in the standard library to be remedied. |
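A minimal sketch of one such workaround, assuming the standard-library route through `SystemTime::duration_since` and `SystemTimeError::duration`. The helper name `signed_unix_time` and the `(seconds, nanoseconds)` return shape are hypothetical, chosen only for illustration; this is not how Serde itself handles the value.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Convert a u64 second count to i64, or None if it does not fit.
fn to_i64(secs: u64) -> Option<i64> {
    if secs <= i64::MAX as u64 {
        Some(secs as i64)
    } else {
        None
    }
}

/// Hypothetical helper: signed offset from the UNIX epoch as
/// (seconds, nanoseconds), with nanoseconds always in 0..1_000_000_000,
/// the same convention as a C `timespec`.
fn signed_unix_time(t: SystemTime) -> Option<(i64, u32)> {
    match t.duration_since(UNIX_EPOCH) {
        // At or after the epoch: the Duration is the offset itself.
        Ok(after) => Some((to_i64(after.as_secs())?, after.subsec_nanos())),
        // Before the epoch: the error carries the positive distance back
        // to the epoch, so negate it.
        Err(e) => {
            let before = e.duration();
            let secs = to_i64(before.as_secs())?;
            let nanos = before.subsec_nanos();
            if nanos == 0 {
                Some((-secs, 0))
            } else {
                // e.g. 1.25 s before the epoch becomes (-2, 750_000_000)
                Some((-secs - 1, 1_000_000_000 - nanos))
            }
        }
    }
}

fn main() {
    let t = UNIX_EPOCH - Duration::new(1, 0);
    println!("{:?}", signed_unix_time(t));
}
```

The pair of plain integers can then be serialized by any format that supports signed 64-bit values, which is the case for the `st_mtime=-1` file described in this issue.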
Yep, I figured that out after writing the previous comment, and started working on a patch to serde to fix it. My main concern is that So I think you'd have to do it without I think that you'll just have to use an |
Hmm. Good point. I have no problem with being clamped to a little over 292 billion years from the epoch for positive range, but I really would like to avoid this being another non-obvious location where someone who wants high reliability has to guard against failure. Would it be a breaking change to switch to |
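For reference, the "292 billion years" figure can be checked with a little arithmetic. This is only a sketch: it assumes the count is whole seconds in a signed 64-bit integer and uses a 365.25-day year; the constant and function names are hypothetical.

```rust
/// Seconds in a 365.25-day year (an assumption made for this estimate).
const SECONDS_PER_YEAR: u64 = 31_557_600;

/// Years after the epoch reachable with a signed 64-bit second count.
fn years_in_i64_seconds() -> u64 {
    i64::MAX as u64 / SECONDS_PER_YEAR
}

/// Years after the epoch reachable with an unsigned 64-bit second count.
fn years_in_u64_seconds() -> u64 {
    u64::MAX / SECONDS_PER_YEAR
}

fn main() {
    println!("i64 seconds: about {} years", years_in_i64_seconds());
    println!("u64 seconds: about {} years", years_in_u64_seconds());
}
```

Under these assumptions a signed count gives up half of the positive range of an unsigned one in exchange for being able to represent pre-epoch times at all.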
I think it will always be a breaking change, as some formats may never support I also think that it might be a breaking change to change the type at all, since the On the other hand, it might be argued that you're replacing a very likely panic on serialization with a less likely properly returned error on deserialization only in the case of two different programs using different versions of Serde. I don't know if Serde has a stability policy for the types used for serializing Of course, one alternative is that you could just use a newtype wrapper and implement |
My main concern here is getting rid of the footgun if at all possible. I really don't want to have to maintain a special "Never allow these types to creep into structs I'm deriving Serialize/Deserialize on, because the compiler certainly won't warn you" audit list.
I've already spent more time trying to get comfortable with the Serde docs for customizing deserialization of complex data than on the rest of the project put together, so I think I'll just go for pragmatism over perfection, spend 5 minutes to manually convert Then I can move on to hacking up a quick implementation of my "Use |
I'm coming back to this with the intent to polish up my workaround into something that shouldn't require a change to my JSON schema if Serde fixes the problem and I remove the workaround and something struck me about this:
...it's conceptually equivalent to saying "We're going to use 32-bit (Except that this DoS is caused by things like If Serde has a stability policy that would preclude a fix to properly handle data that occurs in the wild in the name of ensuring output is deserializable with prior versions of Serde, that's a terrible policy. It's essentially saying "Future programs must remain vulnerable to DoS vulnerabilities from real-world input in order to harmonize with such vulnerabilities in old versions". Widening the range of acceptable values to align with in-the-wild input should never be blocked by stability guarantees on generated output... especially since such dates could already be generated in JSON from other serializers attempting to interoperate with Serde. As for |
This issue can be closed because it has been fixed in serde.
use std::time::Duration;
use serde_json;
fn main() {
    let t = std::time::UNIX_EPOCH - Duration::new(1, 0);
    println!("{:?}", serde_json::to_string(&t));
}
There is no longer a panic. This can also be seen by the code in serde https://github.com/serde-rs/serde/blob/5b140361a31c21713e59fe4cc35ab1d192bbc79f/serde/src/ser/impls.rs#L611 changed in serde-rs/serde@a81968a.
While that is an improvement, given that both file timestamps and JSON integers may be negative, and that most people probably never think to test with negative file timestamps, I think this is still too much of a footgun to be considered solved in a Rust library. The "run a slow batch job, watch it fail at 80% because Serde introduced a surprising artificial limitation that needs to be worked around with a newtype wrapper" bug is still present.
While working on a tool to index files for one of my projects, I wound up getting a panic during the call to serde_json:
I tracked it down to a specific file in my home media server. Apparently, the combination of Linux and the ext3 filesystem is perfectly capable of representing at least one date before the UNIX epoch.
ls -lh: Dec 31 1969
os.stat: st_mtime=-1
std::fs::metadata: modified: Ok(SystemTime { tv_sec: -1, tv_nsec: 0 })
...and the irony is, SystemTime is the one piece of the Rust-provided metadata that I felt was already in a format suitable for a generic index file to be shared between programs written in different languages.
Well... that and it was a hassle to track down because Rust itself didn't complain and the panic message during serialization wouldn't tell me which file of the hundreds of thousands was causing it to die. For lack of a purpose-built tool, I had to manually bisect it until I narrowed it down.
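As an alternative to bisecting by hand, a small pre-scan can report which paths carry a pre-epoch modification time before any serialization is attempted. This is a rough sketch using only the standard library; the function names `is_pre_epoch` and `find_pre_epoch` are hypothetical, and it stops at the first I/O error rather than skipping unreadable entries.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// True if the timestamp lies strictly before the UNIX epoch.
fn is_pre_epoch(t: SystemTime) -> bool {
    t < UNIX_EPOCH
}

/// Recursively collect every entry under `dir` whose mtime is pre-epoch.
fn find_pre_epoch(dir: &Path, hits: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::metadata does not follow symlinks, so a symlink to a
        // directory is not descended into.
        let meta = entry.metadata()?;
        if is_pre_epoch(meta.modified()?) {
            hits.push(entry.path());
        }
        if meta.is_dir() {
            find_pre_epoch(&entry.path(), hits)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Directory to scan: first CLI argument, defaulting to the current one.
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let mut hits = Vec::new();
    find_pre_epoch(Path::new(&root), &mut hits)?;
    for path in &hits {
        println!("{}", path.display());
    }
    Ok(())
}
```

Running it against the indexed tree would list the offending files directly, which is the information the panic message did not provide.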