PanasonicRawWbInfo2Descriptor: Infinite recursion extracting light source #419

Closed
alpire opened this issue Jul 22, 2019 · 3 comments · Fixed by #420

alpire commented Jul 22, 2019

Extracting the light source from PanasonicRawWbInfo2 metadata can lead to an infinite recursion.

Stacktrace

Exception in thread "main" java.lang.StackOverflowError
        at java.base/java.util.HashMap.hash(HashMap.java:339)
        at java.base/java.util.HashMap.get(HashMap.java:552)
        at com.drew.metadata.Directory.getObject(Directory.java:1100)
        at com.drew.metadata.Directory.getInteger(Directory.java:462)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getWbTypeDescription(PanasonicRawWbInfo2Descriptor.java:65)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getDescription(PanasonicRawWbInfo2Descriptor.java:56)
        at com.drew.metadata.TagDescriptor.getLightSourceDescription(TagDescriptor.java:463)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getWbTypeDescription(PanasonicRawWbInfo2Descriptor.java:69)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getDescription(PanasonicRawWbInfo2Descriptor.java:56)
        at com.drew.metadata.TagDescriptor.getLightSourceDescription(TagDescriptor.java:463)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getWbTypeDescription(PanasonicRawWbInfo2Descriptor.java:69)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getDescription(PanasonicRawWbInfo2Descriptor.java:56)
        at com.drew.metadata.TagDescriptor.getLightSourceDescription(TagDescriptor.java:463)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getWbTypeDescription(PanasonicRawWbInfo2Descriptor.java:69)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getDescription(PanasonicRawWbInfo2Descriptor.java:56)
        at com.drew.metadata.TagDescriptor.getLightSourceDescription(TagDescriptor.java:463)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getWbTypeDescription(PanasonicRawWbInfo2Descriptor.java:69)
        at com.drew.metadata.exif.PanasonicRawWbInfo2Descriptor.getDescription(PanasonicRawWbInfo2Descriptor.java:56)
        at com.drew.metadata.TagDescriptor.getLightSourceDescription(TagDescriptor.java:463)
        ...

Steps to repro

You can reproduce the issue by downloading infinite-recursion.txt and running:

java -cp metadata-extractor-2.12.0.jar com.drew.imaging.ImageMetadataReader infinite-recursion.txt

The file was generated by fuzzing and is not a valid file format. I used a .txt extension so that GitHub would allow me to upload it.

Nadahar (Contributor) commented Jul 22, 2019

As far as I can see, this bug can be triggered quite easily with corrupted/invalid files. The infinite loop is very small, just 3 steps, but it's unclear to me exactly where the "logical flaw" is and thus where the fix should be done.

All it takes to achieve this is that the switch matches here:

    switch (tagType) {
        case TagWbType1:
        case TagWbType2:
        case TagWbType3:
        case TagWbType4:
        case TagWbType5:
        case TagWbType6:
        case TagWbType7:
            return getWbTypeDescription(tagType);

..so that getWbTypeDescription(tagType) is called, that the following returns a value (not null):
    Integer wbtype = _directory.getInteger(tagType);

..and that the returned value doesn't match anything in this switch:
    switch (wbtype) {
        case 0:
            return "Unknown";
        case 1:
            return "Daylight";
        case 2:
            return "Fluorescent";
        case 3:
            return "Tungsten (Incandescent)";
        case 4:
            return "Flash";
        case 9:
            return "Fine Weather";
        case 10:
            return "Cloudy";
        case 11:
            return "Shade";
        case 12:
            return "Daylight Fluorescent"; // (D 5700 - 7100K)
        case 13:
            return "Day White Fluorescent"; // (N 4600 - 5500K)
        case 14:
            return "Cool White Fluorescent"; // (W 3800 - 4500K)
        case 15:
            return "White Fluorescent"; // (WW 3250 - 3800K)
        case 16:
            return "Warm White Fluorescent"; // (L 2600 - 3250K)
        case 17:
            return "Standard Light A";
        case 18:
            return "Standard Light B";
        case 19:
            return "Standard Light C";
        case 20:
            return "D55";
        case 21:
            return "D65";
        case 22:
            return "D75";
        case 23:
            return "D50";
        case 24:
            return "ISO Studio Tungsten";
        case 255:
            return "Other";
    }

Without a match here, it defaults back to getDescription(wbtype) and it's rinse and repeat from there.

In the example file in this issue, the value is 5, but any value that's not explicitly in the switch would trigger the loop AFAICU.
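
For illustration, here is a minimal, self-contained sketch of one way such a loop could be broken: give the value switch a terminating default instead of delegating back to getDescription. The class RecursionGuardSketch, the constant TAG_WB_TYPE_1, the map-backed tag storage and the "Unknown (n)" text are all assumptions made for this example. They are not the library's code, and this is not necessarily the change made in #420. Most of the light source cases are omitted for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the descriptor classes; not metadata-extractor code. */
class RecursionGuardSketch {
    /** Hypothetical tag id standing in for TagWbType1..TagWbType7. */
    static final int TAG_WB_TYPE_1 = 1;

    /** Stand-in for the directory: tag type -> raw integer value. */
    private final Map<Integer, Integer> tags = new HashMap<>();

    void setInt(int tagType, int value) {
        tags.put(tagType, value);
    }

    /** Mirrors the outer switch: white balance tags go to the value lookup. */
    String getDescription(int tagType) {
        if (tagType == TAG_WB_TYPE_1) {
            return getWbTypeDescription(tagType);
        }
        // Generic fallback for other tags: render the raw value, if any.
        Integer raw = tags.get(tagType);
        return raw == null ? null : raw.toString();
    }

    /** Value lookup with a default that terminates instead of recursing. */
    String getWbTypeDescription(int tagType) {
        Integer wbType = tags.get(tagType);
        if (wbType == null) {
            return null;
        }
        switch (wbType) {
            case 0:
                return "Unknown";
            case 1:
                return "Daylight";
            case 2:
                return "Fluorescent";
            case 255:
                return "Other";
            default:
                // Unrecognised value (such as 5 from the fuzzed file):
                // describe it directly rather than calling getDescription again.
                return "Unknown (" + wbType + ")";
        }
    }

    public static void main(String[] args) {
        RecursionGuardSketch descriptor = new RecursionGuardSketch();
        descriptor.setInt(TAG_WB_TYPE_1, 5);
        System.out.println(descriptor.getDescription(TAG_WB_TYPE_1));
    }
}
```

The point of the sketch is only that the unmatched-value path must end somewhere that does not route back through the tag-type switch.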

drewnoakes (Owner) commented

Thank you very much for this interesting file. It'd definitely be interesting to hear more about how you generated it.

Confirming that the bug is present in the dotnet implementation as well.

drewnoakes added a commit to drewnoakes/metadata-extractor-images that referenced this issue Jul 23, 2019
This file is not a valid image and was generated by fuzzing.
It triggers an infinite loop in PanasonicRawWbInfoDescriptor and related
code.
drewnoakes added a commit that referenced this issue Jul 23, 2019
drewnoakes added a commit to drewnoakes/metadata-extractor-dotnet that referenced this issue Jul 23, 2019

alpire (Author) commented Jul 23, 2019

@drewnoakes: Glad you found the testcase interesting! It was actually generated automatically using coverage-guided fuzzing while we were testing Apache Tika. I haven't looked at metadata-extractor directly, yet :)

Fuzzing is the automated process of finding software bugs by feeding random data into a target program until one of those permutations reveals a flaw. It's been responsible for discovering a large number of security-critical issues found in operating systems, browsers, ...

One of the most important fuzzing advancements has been coverage guidance. A coverage-guided fuzzer gathers coverage information for each random input it tries. If a random input exercises new code, the fuzzer will keep it in a set of interesting inputs. The fuzzer will then generate new inputs by mutating those interesting inputs.

Coverage-guided fuzzing scales to complex programs. For instance, AFL, a coverage-guided fuzzer, has been used to generate JPEG images out of thin air. By fuzzing a JPEG parser with a coverage-guided fuzzer, the fuzzer eventually generates valid JPEG images! The odds of that happening without coverage guidance are infinitesimally small.
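
To make the keep-and-mutate loop concrete, here is a toy sketch in Java. Everything in it is an assumption for illustration: the class ToyCoverageFuzzer, the four-byte "FUZZ" magic value and the coverage function, which simply counts how many nested checks an input passes. Real fuzzers such as AFL obtain coverage by instrumenting the target, which this sketch does not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Toy illustration of coverage guidance; not how AFL or Mayhem are implemented. */
class ToyCoverageFuzzer {
    /** Hypothetical magic bytes that the toy target checks one at a time. */
    private static final byte[] MAGIC = {'F', 'U', 'Z', 'Z'};

    /** Toy target: "coverage" is the number of leading bytes that match MAGIC. */
    static int coverage(byte[] input) {
        int depth = 0;
        while (depth < MAGIC.length && depth < input.length
                && input[depth] == MAGIC[depth]) {
            depth++;
        }
        return depth;
    }

    /** Returns the inputs that each reached new coverage, in discovery order. */
    static List<byte[]> fuzz(byte[] seed, int iterations, long randomSeed) {
        Random random = new Random(randomSeed);
        List<byte[]> corpus = new ArrayList<>();
        corpus.add(seed.clone());
        int best = coverage(seed);
        for (int i = 0; i < iterations; i++) {
            // Pick one interesting input and mutate a single random byte.
            byte[] candidate = corpus.get(random.nextInt(corpus.size())).clone();
            candidate[random.nextInt(candidate.length)] = (byte) random.nextInt(256);
            int reached = coverage(candidate);
            if (reached > best) {
                // The mutation exercised a new check: keep it for later mutation.
                best = reached;
                corpus.add(candidate);
            }
        }
        return corpus;
    }

    public static void main(String[] args) {
        List<byte[]> corpus = fuzz(new byte[4], 200_000, 42L);
        for (byte[] input : corpus) {
            System.out.println(coverage(input) + " checks passed: " + Arrays.toString(input));
        }
    }
}
```

Because every partial match is kept, the search only has to guess one byte at a time rather than all four at once, which is the same effect that lets a coverage-guided fuzzer work its way through a real parser's checks.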

Shameless plug: I work on Mayhem, the system that won the DARPA Cyber Grand Challenge. We're adapting our bug finding tech to Java and memory-safe languages. We think fuzz testing should be as fundamental as unit testing, and we're actively looking for partners to work with!
