Binary-value options for Reader using flag settings
MarkBaker committed Oct 2, 2022
1 parent e35bbb9 commit 0dfeea8
Showing 4 changed files with 191 additions and 6 deletions.
69 changes: 68 additions & 1 deletion docs/topics/reading-files.md
@@ -168,6 +168,68 @@ Once you have created a reader object for the workbook that you want to
load, you have the opportunity to set additional options before
executing the `load()` method.

All of these options can be set by calling the appropriate methods on the Reader (as described below). Options that have only two possible values can also be set through flags, either by calling the Reader's `setFlags()` method or by passing the flags as the second argument in the call to `load()`.
The options that can be set through flags are:

Option | Flag | Default
-------------------|------------------------------|---
Ignore Empty Cells | IReader::IGNORE_EMPTY_CELLS | Load empty cells
Read Data Only | IReader::READ_DATA_ONLY | Read data, structure and style
Include Charts | IReader::LOAD_WITH_CHARTS | Don't read charts

Several flags can be combined in a single call:
```php
use PhpOffice\PhpSpreadsheet\Reader\IReader;

$inputFileType = 'Xlsx';
$inputFileName = './sampleData/example1.xlsx';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Set additional flags before the call to load() **/
$reader->setFlags(IReader::IGNORE_EMPTY_CELLS | IReader::LOAD_WITH_CHARTS);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```
or
```php
use PhpOffice\PhpSpreadsheet\Reader\IReader;

$inputFileType = 'Xlsx';
$inputFileName = './sampleData/example1.xlsx';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Set additional flags in the call to load() **/
$spreadsheet = $reader->load($inputFileName, IReader::IGNORE_EMPTY_CELLS | IReader::LOAD_WITH_CHARTS);
```
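
The combined value is an ordinary integer bitmask. As a minimal, self-contained sketch of how such a value is built and decoded: the constants below are local stand-ins that mirror the values this commit assigns in `IReader`, and `describeFlags()` is a hypothetical helper for illustration, not part of PhpSpreadsheet.

```php
<?php

// Local stand-ins mirroring the values assigned in IReader by this commit.
const IGNORE_EMPTY_CELLS = 1;
const READ_DATA_ONLY = 2;
const LOAD_WITH_CHARTS = 4;

/**
 * Hypothetical helper: list the names of the flags that are set in $flags.
 *
 * @return string[]
 */
function describeFlags(int $flags): array
{
    $known = [
        'IGNORE_EMPTY_CELLS' => IGNORE_EMPTY_CELLS,
        'READ_DATA_ONLY' => READ_DATA_ONLY,
        'LOAD_WITH_CHARTS' => LOAD_WITH_CHARTS,
    ];

    $names = [];
    foreach ($known as $name => $bit) {
        // A flag is set when its bit survives the bitwise AND.
        if (($flags & $bit) !== 0) {
            $names[] = $name;
        }
    }

    return $names;
}

// Bitwise OR merges the individual bits into one integer.
$flags = IGNORE_EMPTY_CELLS | LOAD_WITH_CHARTS;

echo $flags, PHP_EOL;                               // 5
echo implode(', ', describeFlags($flags)), PHP_EOL; // IGNORE_EMPTY_CELLS, LOAD_WITH_CHARTS
```

Because each flag occupies its own bit, any subset of the three options fits in a single integer argument.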

### Ignoring Empty Cells

Many Excel files have empty rows or columns at the end of a worksheet, which can't easily be seen when looking at the file in Excel (try pressing Ctrl-End to jump to the last cell in a worksheet).
By default, PhpSpreadsheet will load these cells, because they are valid Excel values; but you may find that an apparently small spreadsheet requires a lot of memory for all those empty cells.
If you are running into memory issues with seemingly small files, you can tell PhpSpreadsheet not to load those empty cells using the `setReadEmptyCells()` method.

```php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cells that contain actual content **/
$reader->setReadEmptyCells(false);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

Note that cells containing formulae will still be loaded, even if the formula evaluates to NULL or an empty string.
Similarly, Conditional Styling might also hide the value of a cell; but cells that contain Conditional Styling or Data Validation will always be loaded regardless of their value.

This option is available for the following formats:

Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx | YES | Xls | YES | Xml | NO |
Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | NO | HTML | NO

This option is also available through flags.

### Reading Only Data from a Spreadsheet File

If you're only interested in the cell values in a workbook, but don't
@@ -210,6 +272,8 @@ Xlsx | YES | Xls | YES | Xml | YES |
Ods | YES | SYLK | NO | Gnumeric | YES |
CSV | NO | HTML | NO

This option is also available through flags.

### Reading Only Named WorkSheets from a File

If your workbook contains a number of worksheets, but you are only
@@ -642,7 +706,7 @@ Xlsx | NO | Xls | NO | Xml | NO |
Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | YES | HTML | NO

-### A Brief Word about the Advanced Value Binder
+## A Brief Word about the Advanced Value Binder

When loading data from a file that contains no formatting information,
such as a CSV file, then data is read either as strings or numbers
@@ -694,6 +758,9 @@ Xlsx | NO | Xls | NO | Xml | NO
Ods | NO | SYLK | NO | Gnumeric | NO
CSV | YES | HTML | YES

Note that you can also use the Binder to determine how PhpSpreadsheet identifies datatypes for values when you set a cell value without explicitly setting a datatype.
Value Binders can also be used to set formatting for a cell appropriate to the value.

## Error Handling

Of course, you should always apply some error handling to your scripts
18 changes: 14 additions & 4 deletions src/PhpSpreadsheet/Reader/BaseReader.php
@@ -140,8 +140,14 @@ public function getSecurityScanner()
         return $this->securityScanner;
     }
 
-    protected function processFlags(int $flags): void
+    public function setFlags(int $flags): void
     {
+        if (((bool) ($flags & self::IGNORE_EMPTY_CELLS)) === true) {
+            $this->setReadEmptyCells(false);
+        }
+        if (((bool) ($flags & self::READ_DATA_ONLY)) === true) {
+            $this->setReadDataOnly(true);
+        }
         if (((bool) ($flags & self::LOAD_WITH_CHARTS)) === true) {
             $this->setIncludeCharts(true);
         }
@@ -155,13 +161,17 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
     /**
      * Loads Spreadsheet from file.
      *
-     * @param int $flags the optional second parameter flags may be used to identify specific elements
+     * @param ?int $flags the optional second parameter flags may be used to identify specific elements
      *            that should be loaded, but which won't be loaded by default, using these values:
+     *            IReader::IGNORE_EMPTY_CELLS - Don't create empty cells (those containing a null or an empty string)
+     *            IReader::READ_DATA_ONLY - Only read data from the file, not structure or styling
      *            IReader::LOAD_WITH_CHARTS - Include any charts that are defined in the loaded file
      */
-    public function load(string $filename, int $flags = 0): Spreadsheet
+    public function load(string $filename, ?int $flags = null): Spreadsheet
     {
-        $this->processFlags($flags);
+        if ($flags !== null) {
+            $this->setFlags($flags);
+        }
 
         IOFactory::setLoading(true);
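
One consequence of the flag handling in this file, assuming the parts of `setFlags()` and `load()` not visible in the hunks do not reset anything: a set bit switches its option away from the default, but a cleared bit leaves the current setting untouched. The sketch below illustrates this with `FlagReaderSketch`, a hypothetical stand-in class that mirrors only the visible logic and is not the PhpSpreadsheet reader.

```php
<?php

/**
 * Hypothetical stand-in for a reader. It mirrors only the flag handling
 * visible in the diff and is not the PhpSpreadsheet class.
 */
final class FlagReaderSketch
{
    public const IGNORE_EMPTY_CELLS = 1;
    public const READ_DATA_ONLY = 2;
    public const LOAD_WITH_CHARTS = 4;

    public bool $readEmptyCells = true;
    public bool $readDataOnly = false;
    public bool $includeCharts = false;

    public function setFlags(int $flags): void
    {
        // A set bit changes its option; a cleared bit changes nothing.
        if (($flags & self::IGNORE_EMPTY_CELLS) !== 0) {
            $this->readEmptyCells = false;
        }
        if (($flags & self::READ_DATA_ONLY) !== 0) {
            $this->readDataOnly = true;
        }
        if (($flags & self::LOAD_WITH_CHARTS) !== 0) {
            $this->includeCharts = true;
        }
    }

    public function load(?int $flags = null): void
    {
        // null means no flags argument was given, so earlier settings are kept.
        if ($flags !== null) {
            $this->setFlags($flags);
        }
    }
}

$reader = new FlagReaderSketch();
$reader->setFlags(FlagReaderSketch::READ_DATA_ONLY);
$reader->load();  // no flags argument: READ_DATA_ONLY stays in effect
$reader->load(0); // explicit 0: no bits are set, so nothing changes either

var_dump($reader->readDataOnly); // bool(true)
```

Under that assumption, switching an option back off requires calling the corresponding setter on the reader, since no flag value clears it.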

4 changes: 3 additions & 1 deletion src/PhpSpreadsheet/Reader/IReader.php
@@ -6,7 +6,9 @@

 interface IReader
 {
-    public const LOAD_WITH_CHARTS = 1;
+    public const IGNORE_EMPTY_CELLS = 1;
+    public const READ_DATA_ONLY = 2;
+    public const LOAD_WITH_CHARTS = 4;
 
     /**
      * Can the current IReader read the file?
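
Because `LOAD_WITH_CHARTS` moves from 1 to 4 in this hunk, a caller that passed the literal integer `1` instead of the named constant would now be requesting `IGNORE_EMPTY_CELLS`. The following sketch illustrates the difference using the values from the diff; `flagsIn()` and the two constant maps are hypothetical names introduced only for this example.

```php
<?php

// Flag values before and after this commit, copied from the diff.
const FLAGS_BEFORE = ['LOAD_WITH_CHARTS' => 1];
const FLAGS_AFTER = [
    'IGNORE_EMPTY_CELLS' => 1,
    'READ_DATA_ONLY' => 2,
    'LOAD_WITH_CHARTS' => 4,
];

/**
 * Hypothetical helper: names of the flags in $map whose bit is set in $value.
 *
 * @param array<string, int> $map
 *
 * @return string[]
 */
function flagsIn(int $value, array $map): array
{
    // array_filter keeps the keys, so array_keys yields the flag names.
    return array_keys(
        array_filter($map, fn (int $bit): bool => ($value & $bit) !== 0)
    );
}

echo implode(', ', flagsIn(1, FLAGS_BEFORE)), PHP_EOL; // LOAD_WITH_CHARTS
echo implode(', ', flagsIn(1, FLAGS_AFTER)), PHP_EOL;  // IGNORE_EMPTY_CELLS
```

Code that refers to `IReader::LOAD_WITH_CHARTS` by name is unaffected, since the constant resolves to the new value automatically.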
106 changes: 106 additions & 0 deletions tests/PhpSpreadsheetTests/Reader/ReaderFlagsTest.php
@@ -0,0 +1,106 @@
<?php

namespace PhpOffice\PhpSpreadsheetTests\Reader;

use PhpOffice\PhpSpreadsheet\Reader\IReader;
use PhpOffice\PhpSpreadsheet\Reader\Xlsx;
use PHPUnit\Framework\TestCase;

class ReaderFlagsTest extends TestCase
{
    private const EMPTY_CELLS = 'Empty Cells';
    private const DATA_ONLY = 'Data only';
    private const WITH_CHARTS = 'with Charts';

    /**
     * @var Xlsx
     */
    private $reader;

    protected function setUp(): void
    {
        $this->reader = new Xlsx();
    }

    /**
     * @dataProvider flagsProvider
     */
    public function testFlags(int $flags, array $settings): void
    {
        $this->reader->setFlags($flags);

        self::assertSame($settings[self::EMPTY_CELLS], $this->reader->getReadEmptyCells());
        self::assertSame($settings[self::DATA_ONLY], $this->reader->getReadDataOnly());
        self::assertSame($settings[self::WITH_CHARTS], $this->reader->getIncludeCharts());
    }

    public function flagsProvider(): array
    {
        return [
            [
                0,
                [
                    self::EMPTY_CELLS => true,
                    self::DATA_ONLY => false,
                    self::WITH_CHARTS => false,
                ],
            ],
            [
                IReader::IGNORE_EMPTY_CELLS,
                [
                    self::EMPTY_CELLS => false,
                    self::DATA_ONLY => false,
                    self::WITH_CHARTS => false,
                ],
            ],
            [
                IReader::READ_DATA_ONLY,
                [
                    self::EMPTY_CELLS => true,
                    self::DATA_ONLY => true,
                    self::WITH_CHARTS => false,
                ],
            ],
            [
                IReader::IGNORE_EMPTY_CELLS | IReader::READ_DATA_ONLY,
                [
                    self::EMPTY_CELLS => false,
                    self::DATA_ONLY => true,
                    self::WITH_CHARTS => false,
                ],
            ],
            [
                IReader::LOAD_WITH_CHARTS,
                [
                    self::EMPTY_CELLS => true,
                    self::DATA_ONLY => false,
                    self::WITH_CHARTS => true,
                ],
            ],
            [
                IReader::IGNORE_EMPTY_CELLS | IReader::LOAD_WITH_CHARTS,
                [
                    self::EMPTY_CELLS => false,
                    self::DATA_ONLY => false,
                    self::WITH_CHARTS => true,
                ],
            ],
            [
                IReader::READ_DATA_ONLY | IReader::LOAD_WITH_CHARTS,
                [
                    self::EMPTY_CELLS => true,
                    self::DATA_ONLY => true,
                    self::WITH_CHARTS => true,
                ],
            ],
            [
                IReader::IGNORE_EMPTY_CELLS | IReader::READ_DATA_ONLY | IReader::LOAD_WITH_CHARTS,
                [
                    self::EMPTY_CELLS => false,
                    self::DATA_ONLY => true,
                    self::WITH_CHARTS => true,
                ],
            ],
        ];
    }
}
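
The eight provider cases above cover every combination of the three flags. As a rough sketch of an alternative, the same table could be generated rather than written out by hand. `allFlagCases()` and the top-level constants are hypothetical stand-ins for this illustration and are not part of the commit.

```php
<?php

// Local stand-ins for the IReader flag values introduced by this commit.
const IGNORE_EMPTY_CELLS = 1;
const READ_DATA_ONLY = 2;
const LOAD_WITH_CHARTS = 4;

/**
 * Hypothetical generator: build one [flags, expected settings] case for
 * every combination of the three flags (integers 0 through 7).
 */
function allFlagCases(): array
{
    $all = IGNORE_EMPTY_CELLS | READ_DATA_ONLY | LOAD_WITH_CHARTS;

    $cases = [];
    for ($flags = 0; $flags <= $all; ++$flags) {
        $cases[] = [
            $flags,
            [
                // Empty cells are read unless IGNORE_EMPTY_CELLS is set.
                'Empty Cells' => ($flags & IGNORE_EMPTY_CELLS) === 0,
                'Data only' => ($flags & READ_DATA_ONLY) !== 0,
                'with Charts' => ($flags & LOAD_WITH_CHARTS) !== 0,
            ],
        ];
    }

    return $cases;
}

echo count(allFlagCases()), PHP_EOL; // 8
```

The hand-written table in the test has the advantage that its expectations are stated independently of the bitwise logic under test, whereas generated expectations repeat that logic.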
