Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Found a runtime compiler bug #39177

Closed
J-F-Liu opened this issue Jan 19, 2017 · 9 comments
Closed

Found a runtime compiler bug #39177

J-F-Liu opened this issue Jan 19, 2017 · 9 comments

Comments

@J-F-Liu
Copy link

J-F-Liu commented Jan 19, 2017

I am the author of pom, now I use pom to implement a PDF parser and get some surprising result:

git clone https://github.com/J-F-Liu/lopdf.git
cd lopdf/pdfutil
cargo build
cargo run -- decompress -i /home/junfeng/Downloads/OpenGEX.1.1.2.pdf

The output is:

▶ cargo run -- decompress -i /home/junfeng/Downloads/OpenGEX.1.1.2.pdf
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/pdfutil decompress -i /home/junfeng/Downloads/OpenGEX.1.1.2.pdf`
Open /home/junfeng/Downloads/OpenGEX.1.1.2.pdf
xref has 75 entires
...
xref has 75 entires
xref has 139815792274816 entires
xref has 75 entires
xref has 75 entires
xref has 139815791991872 entires
xref has 75 entires
xref has 75 entires
xref has 139815792057408 entires
xref has 75 entires
xref has 75 entires
xref has 139815792057408 entires
xref has 75 entires
xref has 139815792057696 entires
...
xref has 139815792057696 entires
xref has 75 entires
Do decompress

xref is the same BTreeMap variable captured in nested closures printing its len().
Notice these big random numbers, probably caused by wrong pointers.

Try another PDF file:

▶ cargo run -- decompress -i /home/junfeng/Downloads/rfc7159.pdf           
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/pdfutil decompress -i /home/junfeng/Downloads/rfc7159.pdf`
Open /home/junfeng/Downloads/rfc7159.pdf
xref has 234 entires
xref has 234 entires
xref has 234 entires
xref has 234 entires
xref has 234 entires
xref has 139647596457568 entires

This time the parser actually accesses entry in xref, the program failed and exited without printing any panic message!

The download links of these example PDF files are:

You can play with the code, add more debug infos, please find out the root cause.

@est31
Copy link
Member

est31 commented Jan 19, 2017

Link to valgrind run: http://paste.ubuntu.com/23828037/

Seems it errors with signal 11/SIGSEV/General Protection fault.

@J-F-Liu were you able to produce a more minimal example to reproduce the issue? This would be very helpful.

@J-F-Liu
Copy link
Author

J-F-Liu commented Jan 19, 2017

At home now, will try tomorrow.

@est31
Copy link
Member

est31 commented Jan 19, 2017

When I replace the two deps that use unsafe by shims, I still can reproduce: https://github.com/est31/lopdf/commit/2d16d23a81205b16f937a852339e4e25498e458f

This means only safe code is used, and means its a valid bug (of the compiler). One possible reason is the usage of some unstable and possibly buggy features in the pom crate, but that's only a guess.

@TimNN
Copy link
Contributor

TimNN commented Jan 19, 2017

Potentially a duplicate of #33197.

@est31
Copy link
Member

est31 commented Jan 19, 2017

If you add these two lines to the code:

fn stream(reader: &Reader) -> Parser<u8, Stream> {
	println!("READER addr is (oside): {:p}, {:x}", reader, reader.v); // ADDED
	dictionary() - space() - seq(b"stream") - eol() >>
	|dict: Dictionary| {
		println!("READER addr is (iside): {:p}, {:x}", reader, reader.v); //ADDED
		reader.print_xref_size();
		let length = dict.get("Length").and_then(|value| {
			if let Some(id) = value.as_reference() {
				return reader.get_object(id).and_then(|value|value.as_i64());
			}
			return value.as_i64();
		}).expect("Stream Length should be an integer.");
		let stream = take(length as usize) - eol().opt() - seq(b"endstream");
		stream.map(move |data|Stream::new(dict.clone(), data))
	}
}

With reader.v being an u32 member and being set to 0xdeadbeef and never changing, you get output:

READER addr is (oside): 0x7ffd7848b9c8, deadbeef
READER addr is (oside): 0x7ffd7848b9c8, deadbeef
READER addr is (oside): 0x7ffd7848b9c8, deadbeef
READER addr is (oside): 0x7ffd7848b9c8, deadbeef
READER addr is (oside): 0x7ffd7848b9c8, deadbeef
READER addr is (iside): 0x55ce8a486728, 89f352c0

So its less likely to be a bug with hashmaps. All types used as keys in BTreeMap seem to be innocent (String, (u32,u16), u32,...).

@nagisa
Copy link
Member

nagisa commented Jan 19, 2017

So this is the more minimal case I’ve constructed.

Running this program will result in two different pointers being printed. Looking at the weird_stuff_here its pretty clear the pointer gets put into closure by-ref, and then the closure outlives the stack frame it was created in, so somehow borrowck is not working here.

Moving the O: 'static, U: 'static, F: 'static bounds onto the trait, as opposed to the fn shr function, makes borrowck report the issue properly, so it seems like this is basically fixed when #37166 becomes a hard error.

EDIT: this code warns about soundness issue and also that it will be phased out in the future, but there’s a #[allow({lint})] that ignores these warnings.

@est31
Copy link
Member

est31 commented Jan 19, 2017

@est31
Copy link
Member

est31 commented Jan 19, 2017

so it seems like this is basically fixed when #37166 becomes a hard error.

so this is a dupe of #18937 then?

@nagisa
Copy link
Member

nagisa commented Jan 19, 2017

so this is a dupe of #18937 then?

Yes, although this issue is interesting in a way that it has an actual code that is able to exploit the issue to cause unsoundness. Should still be closed.

@arielb1 arielb1 closed this as completed Jan 19, 2017
J-F-Liu added a commit to J-F-Liu/lopdf that referenced this issue Jan 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants