Suboptimal memory usage in PhpSpreadsheet\Reader\Xlsx->readColumnsAndRowsAttributes() #648

@tomi-heiskanen

This is: a bug report & suggested patch

What is the expected behavior?

Loading should not consume double the memory that processing actually requires.

What is the current behavior?

Memory usage is unnecessarily doubled because row and column attributes are first read into intermediate arrays and only then set on the worksheet.

What are the steps to reproduce?

  1. Attempt to read a very large Excel file (hundreds of columns and 1M+ rows).
  2. load() runs out of memory (unless gigabytes of memory are allowed for processing).

Which versions of PhpSpreadsheet and PHP are affected?

At least from 1.4.0 up to the current develop branch.

Suggestion how to fix it

I could not create a pull request due to a GitHub permission issue, but please see the patched version of the function, which does not read everything into arrays first but instead sets the attributes while processing. Patched Xlsx.php: https://gist.github.com/tomi-heiskanen/a1d0e3d376d1b019f6072eda33bd0c11
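
To illustrate the difference between the two approaches, here is a minimal, language-neutral sketch in Python. It is not the PhpSpreadsheet code and not the contents of the gist: `Worksheet`, `parse_rows`, `load_buffered`, and `load_streaming` are hypothetical names invented for this example, and the attribute values are made up. The buffered variant builds a complete intermediate collection before touching the worksheet, while the streaming variant applies each row's attributes as soon as they are parsed.

```python
from typing import Dict, Iterable, Iterator, Tuple

RowAttrs = Dict[str, str]


class Worksheet:
    """Hypothetical stand-in for a worksheet that stores per-row attributes."""

    def __init__(self) -> None:
        self.row_attributes: Dict[int, RowAttrs] = {}

    def set_row_attributes(self, row: int, attrs: RowAttrs) -> None:
        self.row_attributes[row] = attrs


def parse_rows(count: int) -> Iterator[Tuple[int, RowAttrs]]:
    """Hypothetical parser: yields (row number, attributes) one row at a time."""
    for row in range(1, count + 1):
        yield row, {"height": "15", "hidden": "false"}


def load_buffered(sheet: Worksheet, rows: Iterable[Tuple[int, RowAttrs]]) -> None:
    # Pattern described in the report: collect every row into an
    # intermediate structure first, then copy it into the worksheet.
    buffered = dict(rows)  # holds an entry for every row at once
    for row, attrs in buffered.items():
        sheet.set_row_attributes(row, attrs)


def load_streaming(sheet: Worksheet, rows: Iterable[Tuple[int, RowAttrs]]) -> None:
    # Suggested pattern: apply each row's attributes as soon as they are
    # parsed, so no intermediate collection of all rows is ever built.
    for row, attrs in rows:
        sheet.set_row_attributes(row, attrs)


if __name__ == "__main__":
    a, b = Worksheet(), Worksheet()
    load_buffered(a, parse_rows(1000))
    load_streaming(b, parse_rows(1000))
    # Both approaches leave the worksheet in the same final state.
    print(a.row_attributes == b.row_attributes)
```

Both functions produce the same worksheet state; the only difference is whether a second, temporary collection of all rows exists during loading. How much memory this saves in practice depends on the runtime and on the size of the file.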

Labels: enhancement, reader/xlsx (Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files), stale