java.lang.OutOfMemoryError #253

Closed
greg-michael opened this issue Jan 18, 2022 · 10 comments
Labels
discussion question or suggestion patch released

Comments

@greg-michael

greg-michael commented Jan 18, 2022

We are running log4j2-scan.exe v2.7.1 on our Windows servers using TrueSight Server Automation (TSSA) to deploy and execute the package. This is done through a mapped-user elevation to a local-administrator account and executed via command shell as that user.

On a particular server, we're seeing this error from the output of the execution of the scanner:
Logpresso CVE-2021-44228 Vulnerability Scanner 2.7.1 (2022-01-02) (Time in agent's deploy log:: 01/17/2022 13:50:37)
Scanning drives: C:\, M:\ (without P:, Z:) (Time in agent's deploy log:: 01/17/2022 13:50:38)
Scanned 3191 directories and 27196 files
Found 0 vulnerable files
Found 0 potentially vulnerable files
Found 0 mitigated files
Completed in 10403.38 seconds
Error: Garbage-collected heap size exceeded.
java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
"C:\temp\stage\b197902652953cc29ef9df4465ff0232\bldeploycmd-2.bat": Item 'Execute log4j2-scan.exe' returned exit code -1 (Time in agent's deploy log:: 01/17/2022 16:44:03)
"C:\temp\stage\b197902652953cc29ef9df4465ff0232\bldeploycmd-2.bat": Command returned non-zero exit code: -1 (Time in agent's deploy log:: 01/17/2022 16:44:03)

The scanner is executed using this command string. Note that %RPTFILE% is defined prior to the execution of the scanner.
log4j2-scan.exe --silent --scan-zip --scan-log4j1 --all-drives --report-path "%RPTFILE%" --report-dir "C:\Temp" --exclude "P:" --exclude "Z:" --exclude-fs afs,cifs,autofs,tmpfs,devtmpfs,fuse.sshfs,iso9660 2>&1

This scan seems to take an inordinately long time to run because it is getting stuck trying to scan more than 6M machine key files from an application called PortalProtect by TrendMicro. The file names are lengthy GUID-type names, and each file is roughly 1k in size. There's no specific type other than "System File", which is generic. The file names do not follow any easily distinguishable pattern that would allow a simple exclusion filter.

When I ran the scan manually without the --silent option and included the --trace and --debug options, it got as far as this directory on the C: drive, output one single line of status update after 10 seconds and then basically hung itself.

Edit:
The process was slowly climbing up the ladder consuming all available memory. I had to kill it before it caused the server to run out of memory. It would appear that there is a need to include some regular flushing of memory to the log and then cycling through the next batch of X files and directories, especially when there are significant numbers of files to be processed. (i.e. - millions)

@xeraph xeraph self-assigned this Jan 18, 2022
@xeraph xeraph added the discussion question or suggestion label Jan 18, 2022
@xeraph
Contributor

xeraph commented Jan 18, 2022

@greg-michael Due to a Java API limitation, the scanner cannot handle a directory that contains millions of files.

    File[] files = f.listFiles();
    if (files == null)
        return;
    for (File file : files) {
        traverse(file, depth + 1);
    }

As you can see, the listFiles() API returns all entries of a single directory as one array. Because this is a single function call, the scanner has no chance to free memory until the whole listing has been built.

Since I raised the minimum JDK version from 6 to 7, the Files.walkFileTree API might be an alternative solution. Until that patch is available, there is one workaround: -f file_path_list.txt. Write the paths of all files to scan into file_path_list.txt and pass that file to the -f option.
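
For illustration only, here is a minimal sketch of what a Files.walkFileTree based traversal could look like on JDK 7. It is not the scanner's actual patch: the class name LazyTraversal and the FileHandler callback are hypothetical stand-ins for the real per-file check. The idea is that entries are handed to the visitor one at a time, so the traversal itself should not need to hold an array of every file in a huge directory.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class LazyTraversal {
    // Hypothetical callback standing in for the scanner's per-file check.
    public interface FileHandler {
        void handle(Path file) throws IOException;
    }

    // Walks the tree under root, passing each non-directory entry to the
    // handler as it is encountered. Returns the number of entries handled.
    public static long walk(Path root, final FileHandler handler) throws IOException {
        final long[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                handler.handle(file);
                count[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip unreadable entries instead of aborting the whole walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long n = walk(root, new FileHandler() {
            @Override
            public void handle(Path file) {
                // A real scanner would inspect the file here.
            }
        });
        System.out.println("Visited " + n + " files");
    }
}
```

Whether this actually lowers peak memory on a directory with millions of entries depends on the runtime in use, so it would need to be measured on an affected server.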

@greg-michael
Author

Understood. We'll have to look into whether we just exclude that path or accept that the scanner cannot scan this system in its current version due to limitations within the Java code. Please let me know when you have an updated version, and I'll test it.

Thanks!

@xeraph
Contributor

xeraph commented Jan 18, 2022

@greg-michael I will. Thank you for your understanding :D

@greg-michael
Author

@xeraph Any updates on this?

@xeraph
Contributor

xeraph commented Feb 2, 2022

@greg-michael Would you try v2.9.0 release? https://github.com/logpresso/CVE-2021-44228-Scanner/releases/tag/v2.9.0

@greg-michael
Author

The new release seems to have helped for a number of systems. With the -Xmx option, what is the implication of setting a value for example -Xmx1000M? Would this mean that any files larger than 1GB would be skipped? Any directories/paths with sufficient number of files to exceed 1GB heap space would cause the scanner to abort? I'm trying to fine tune the scan executions in my Windows environment so that I can get a number of servers that are not finishing within a 5 hour window to be able to complete their scans. FWIW - the only servers that I'm still seeing occasional Out of Memory errors on are our Outlook Web Access servers. Those have to be scanned with -Xmx1000M or lower to be successfully scanned.

Thanks!

@xeraph
Contributor

xeraph commented Feb 8, 2022

@greg-michael Wow, that's good news. The Xmx switch is supported by both the JVM and Substrate VM, and it only sets the maximum memory available to the Java process. If the scanner cannot allocate memory beyond the specified limit, it fails with OOM as usual. Therefore, if the scanner completes a scan without any error with the Xmx1000M option, it has successfully scanned all files.

By the way, which one did you use: the JAR version or the native binary?
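
As a small illustration of the point that Xmx is only a ceiling on the heap and not a filter on which files are processed, the following sketch prints the limit the runtime reports. The class name HeapLimit is hypothetical and is not part of the scanner.

```java
public class HeapLimit {
    // Maximum heap the runtime will attempt to use, in MiB.
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

On a regular JVM, running it as `java -Xmx1000M HeapLimit` should report a value close to 1000 MiB. The exact figure can differ slightly from the requested limit.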

@greg-michael
Author

I am using the binary with the -Xmx switch. There are still a small number of servers that continue to crash with OOM errors, but fewer than before. In particular, Outlook Web Access servers.

How does the use of the -Xmx option affect scanned file sizes? Does it impact them at all? I do have some systems with very large archive files and directories of archives, and I want to ensure that they are still being scanned properly.

@xeraph
Contributor

xeraph commented Feb 11, 2022

@greg-michael Then try the JAR version with the Xmx switch for the Outlook Web Access servers.

Xmx limits the heap size. In most cases, the scanner can decompress a large archive file piece by piece within the limited heap. If the scanner exits normally, you can be sure the files were scanned properly. If in doubt, test with a large ZIP file that embeds a vulnerable log4j file (scan the ZIP file with the --scan-zip option). You can specify a target file path instead of a directory path.
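
One possible way to build such a test archive is sketched below. It is only a sketch: the class name MakeTestZip, the padding.bin entry, and the command-line arguments are hypothetical, and you must supply the inner JAR yourself (for example, a copy of a log4j-core JAR you already know to be vulnerable). The padding is random data, so it does not compress and the resulting ZIP is genuinely large on disk.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class MakeTestZip {
    // Writes outZip containing paddingMiB of incompressible filler followed
    // by a copy of innerJar stored under its own file name.
    public static void wrap(Path innerJar, Path outZip, int paddingMiB) throws IOException {
        try (OutputStream out = Files.newOutputStream(outZip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("padding.bin"));
            byte[] chunk = new byte[1024 * 1024];
            Random rnd = new Random(42);
            for (int i = 0; i < paddingMiB; i++) {
                rnd.nextBytes(chunk); // random bytes barely compress
                zos.write(chunk);
            }
            zos.closeEntry();

            zos.putNextEntry(new ZipEntry(innerJar.getFileName().toString()));
            Files.copy(innerJar, zos); // stream the JAR into the archive
            zos.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): java MakeTestZip <inner.jar> <out.zip> <paddingMiB>
        wrap(Paths.get(args[0]), Paths.get(args[1]), Integer.parseInt(args[2]));
    }
}
```

The resulting file can then be passed to the scanner as the target path together with --scan-zip and the same Xmx value used in production, to check that the embedded JAR is still reported.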

@greg-michael
Author

Closing issue since the -Xmx switch seems to have helped for the majority of servers experiencing OOM issues during scans.
