Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org).

### Added

- Add to all readers the option to allow or forbid fetching external images. This is unconditionally allowed now. The default will be set to "allow", so no code changes are necessary. However, we are giving consideration to changing the default. [PR #4543](https://github.com/PHPOffice/PhpSpreadsheet/pull/4543)
- Address Excel Inappropriate Number Format Substitution. [PR #4532](https://github.com/PHPOffice/PhpSpreadsheet/pull/4532)

### Removed
Expand Down
23 changes: 23 additions & 0 deletions docs/references/features-cross-reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,17 @@
<td style="text-align: center;">N/A</td>
<td style="text-align: center; color: green;">✔</td>
</tr>
<tr>
<td style="padding-left: 1em;">Read Charts</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: green;">✔</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center;">N/A</td>
<td style="text-align: center;">N/A</td>
<td style="text-align: center; color: red;">✖</td>
</tr>
<tr>
<td style="padding-left: 1em;">Read Data Only (no formatting)</td>
<td style="text-align: center; color: green;">✔</td>
Expand Down Expand Up @@ -67,6 +78,17 @@
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: green;">✔</td>
</tr>
<tr>
<td style="padding-left: 1em;">Read External Images</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: green;">✔ <a href="#footnote9"><sup>9</sup></a></td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center; color: red;">✖</td>
<td style="text-align: center;">N/A</td>
<td style="text-align: center;">N/A</td>
<td style="text-align: center; color: green;">✔ <a href="#footnote9"><sup>9</sup></a></td>
</tr>
<tr>
<td style="padding-left: 0.5em;"><strong>Document Properties</strong></td>
<td style="text-align: center; color: orange;">●</td>
Expand Down Expand Up @@ -1006,6 +1028,7 @@
6. <span id="footnote6">There is very limited support for reading styles from an Ods spreadsheet. Writing styles has better support, although Number Format is incomplete.</span>
7. <span id="footnote7">In most cases, Html reader processes only inline styles; styles provided by Css classes may be ignored.</span>
8. <span id="footnote8">Code must [opt in](../topics/recipes.md#array-formulas) to array output.</span>
9. <span id="footnote9">Starting with release 4.5, code can allow or not external images. In release 4.5 (and in earlier releases which do not offer an option), default is to allow it.</span>

## Writers

Expand Down
29 changes: 17 additions & 12 deletions docs/topics/reading-and-writing-to-file.md
Original file line number Diff line number Diff line change
Expand Up @@ -1114,7 +1114,10 @@ $reader->load("spreadsheetWithCharts.xlsx", $reader::LOAD_WITH_CHARTS);
If you wish to use the IOFactory `load()` method rather than instantiating a specific Reader, then you can still pass these flags.

```php
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load("spreadsheetWithCharts.xlsx", \PhpOffice\PhpSpreadsheet\Reader\IReader::LOAD_WITH_CHARTS);
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load(
"spreadsheetWithCharts.xlsx",
\PhpOffice\PhpSpreadsheet\Reader\IReader::LOAD_WITH_CHARTS
);
```

Flags that are available that can be passed to the Reader in this way include:
Expand All @@ -1123,17 +1126,19 @@ Flags that are available that can be passed to the Reader in this way include:
- $reader::READ_DATA_ONLY
- $reader::IGNORE_EMPTY_CELLS
- $reader::IGNORE_ROWS_WITH_NO_CELLS

| Readers | LOAD_WITH_CHARTS | READ_DATA_ONLY | IGNORE_EMPTY_CELLS | IGNORE_ROWS_WITH_NO_CELLS |
|----------|------------------|----------------|--------------------|---------------------------|
| Xlsx | YES | YES | YES | YES |
| Xls | NO | YES | YES | NO |
| Xml | NO | NO | NO | NO |
| Ods | NO | YES | NO | NO |
| Gnumeric | NO | YES | NO | NO |
| Html | N/A | N/A | N/A | N/A |
| Slk | N/A | NO | NO | NO |
| Csv | N/A | NO | NO | NO |
- $reader::ALLOW_EXTERNAL_IMAGES (starting with release 4.5)
- $reader::DONT_ALLOW_EXTERNAL_IMAGES (starting with release 4.5)

| Readers | LOAD<br>CHARTS | DATA<br>ONLY | IGNORE<br>EMPTY | IGNORE<br>ROWS | EXTERNAL<br>IMAGES |
|----------|--------|------|--------|--------|--------|
| Xlsx | YES | YES | YES | YES | YES |
| Xls | NO | YES | YES | NO | NO |
| Xml | NO | NO | NO | NO | NO |
| Ods | NO | YES | NO | NO | NO |
| Gnumeric | NO | YES | NO | NO | NO |
| Html | N/A | N/A | N/A | N/A | YES |
| Slk | N/A | NO | NO | NO | N/A |
| Csv | N/A | NO | NO | NO | N/A |

Likewise, when saving a file using a Writer, loaded charts will not be saved unless you explicitly tell the Writer to include them:

Expand Down
30 changes: 30 additions & 0 deletions src/PhpSpreadsheet/Reader/BaseReader.php
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,13 @@ abstract class BaseReader implements IReader
*/
protected bool $ignoreRowsWithNoCells = false;

/**
* Allow external images. Use with caution.
* Improper specification of these within a spreadsheet
* can subject the caller to security exploits.
*/
protected bool $allowExternalImages = true;

/**
* IReadFilter instance.
*/
Expand Down Expand Up @@ -149,6 +156,23 @@ public function setReadFilter(IReadFilter $readFilter): self
return $this;
}

/**
* Allow external images. Use with caution.
* Improper specification of these within a spreadsheet
* can subject the caller to security exploits.
*/
public function setAllowExternalImages(bool $allowExternalImages): self
{
$this->allowExternalImages = $allowExternalImages;

return $this;
}

public function getAllowExternalImages(): bool
{
return $this->allowExternalImages;
}

public function getSecurityScanner(): ?XmlScanner
{
return $this->securityScanner;
Expand Down Expand Up @@ -177,6 +201,12 @@ protected function processFlags(int $flags): void
if (((bool) ($flags & self::IGNORE_ROWS_WITH_NO_CELLS)) === true) {
$this->setIgnoreRowsWithNoCells(true);
}
if (((bool) ($flags & self::ALLOW_EXTERNAL_IMAGES)) === true) {
$this->setAllowExternalImages(true);
}
if (((bool) ($flags & self::DONT_ALLOW_EXTERNAL_IMAGES)) === true) {
$this->setAllowExternalImages(false);
}
}

protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
Expand Down
2 changes: 1 addition & 1 deletion src/PhpSpreadsheet/Reader/Html.php
Original file line number Diff line number Diff line change
Expand Up @@ -1129,7 +1129,7 @@ private function insertImage(Worksheet $sheet, string $column, int $row, array $
$name = $attributes['alt'] ?? null;

$drawing = new Drawing();
$drawing->setPath($src, false);
$drawing->setPath($src, false, allowExternal: $this->allowExternalImages);
if ($drawing->getPath() === '') {
return;
}
Expand Down
20 changes: 20 additions & 0 deletions src/PhpSpreadsheet/Reader/IReader.php
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,14 @@ interface IReader
*/
public const IGNORE_ROWS_WITH_NO_CELLS = 8;

/**
* Allow external images. Use with caution.
* Improper specification of these within a spreadsheet
* can subject the caller to security exploits.
*/
public const ALLOW_EXTERNAL_IMAGES = 16;
public const DONT_ALLOW_EXTERNAL_IMAGES = 32;

public function __construct();

/**
Expand Down Expand Up @@ -132,6 +140,15 @@ public function getReadFilter(): IReadFilter;
*/
public function setReadFilter(IReadFilter $readFilter): self;

/**
* Allow external images. Use with caution.
* Improper specification of these within a spreadsheet
* can subject the caller to security exploits.
*/
public function setAllowExternalImages(bool $allowExternalImages): self;

public function getAllowExternalImages(): bool;

/**
* Loads PhpSpreadsheet from file.
*
Expand All @@ -141,6 +158,9 @@ public function setReadFilter(IReadFilter $readFilter): self;
* self::READ_DATA_ONLY Read only data, not style or structure information, from the file
* self::IGNORE_EMPTY_CELLS Don't read empty cells (cells that contain a null value,
* empty string, or a string containing only whitespace characters)
* self::IGNORE_ROWS_WITH_NO_CELLS Don't load any rows that contain no cells.
* self::ALLOW_EXTERNAL_IMAGES Attempt to fetch images stored outside the spreadsheet.
* self::DONT_ALLOW_EXTERNAL_IMAGES Don't attempt to fetch images stored outside the spreadsheet.
*/
public function load(string $filename, int $flags = 0): Spreadsheet;
}
4 changes: 2 additions & 2 deletions src/PhpSpreadsheet/Reader/Xlsx.php
Original file line number Diff line number Diff line change
Expand Up @@ -1518,7 +1518,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
);
if (isset($images[$linkImageKey])) {
$url = str_replace('xl/drawings/', '', $images[$linkImageKey]);
$objDrawing->setPath($url, false);
$objDrawing->setPath($url, false, allowExternal: $this->allowExternalImages);
}
if ($objDrawing->getPath() === '') {
continue;
Expand Down Expand Up @@ -1622,7 +1622,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
);
if (isset($images[$linkImageKey])) {
$url = str_replace('xl/drawings/', '', $images[$linkImageKey]);
$objDrawing->setPath($url, false);
$objDrawing->setPath($url, false, allowExternal: $this->allowExternalImages);
}
if ($objDrawing->getPath() === '') {
continue;
Expand Down
5 changes: 4 additions & 1 deletion src/PhpSpreadsheet/Worksheet/Drawing.php
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ public function getPath(): string
*
* @return $this
*/
public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip = null): static
public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip = null, bool $allowExternal = true): static
{
$this->isUrl = false;
if (preg_match('~^data:image/[a-z]+;base64,~', $path) === 1) {
Expand All @@ -107,6 +107,9 @@ public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip
if (!preg_match('/^(http|https|file|ftp|s3):/', $path)) {
throw new PhpSpreadsheetException('Invalid protocol for linked drawing');
}
if (!$allowExternal) {
return $this;
}
// Implicit that it is a URL, rather store info than running check above on value in other places.
$this->isUrl = true;
$ctx = null;
Expand Down
5 changes: 4 additions & 1 deletion tests/PhpSpreadsheetTests/Reader/Html/HtmlHelper.php
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,12 @@ public static function createHtml(string $html): string
return $filename;
}

public static function loadHtmlIntoSpreadsheet(string $filename, bool $unlink = false): Spreadsheet
public static function loadHtmlIntoSpreadsheet(string $filename, bool $unlink = false, ?bool $allowExternalImages = null): Spreadsheet
{
$html = new Html();
if ($allowExternalImages !== null) {
$html->setAllowExternalImages($allowExternalImages);
}
$spreadsheet = $html->load($filename);
if ($unlink) {
unlink($filename);
Expand Down
41 changes: 37 additions & 4 deletions tests/PhpSpreadsheetTests/Reader/Html/HtmlImage2Test.php
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@

class HtmlImage2Test extends TestCase
{
public function testCanInsertImageGoodProtocol(): void
public function testCanInsertImageGoodProtocolAllowed(): void
{
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
self::markTestSkipped('Skipped due to setting of environment variable');
Expand All @@ -23,16 +23,32 @@ public function testCanInsertImageGoodProtocol(): void
</tr>
</table>';
$filename = HtmlHelper::createHtml($html);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, true);
$firstSheet = $spreadsheet->getSheet(0);

/** @var Drawing $drawing */
$drawing = $firstSheet->getDrawingCollection()[0];
self::assertEquals($imagePath, $drawing->getPath());
self::assertEquals('A1', $drawing->getCoordinates());
$spreadsheet->disconnectWorksheets();
}

public function testCantInsertImageNotFound(): void
public function testCanInsertImageGoodProtocolNotAllowed(): void
{
$imagePath = 'https://phpspreadsheet.readthedocs.io/en/stable/topics/images/01-03-filter-icon-1.png';
$html = '<table>
<tr>
<td><img src="' . $imagePath . '" alt="test image voilà"></td>
</tr>
</table>';
$filename = HtmlHelper::createHtml($html);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, false);
$firstSheet = $spreadsheet->getSheet(0);
self::assertCount(0, $firstSheet->getDrawingCollection());
$spreadsheet->disconnectWorksheets();
}

public function testCantInsertImageNotFoundAllowed(): void
{
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
self::markTestSkipped('Skipped due to setting of environment variable');
Expand All @@ -44,10 +60,27 @@ public function testCantInsertImageNotFound(): void
</tr>
</table>';
$filename = HtmlHelper::createHtml($html);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, true);
$firstSheet = $spreadsheet->getSheet(0);
$drawingCollection = $firstSheet->getDrawingCollection();
self::assertCount(0, $drawingCollection);
$spreadsheet->disconnectWorksheets();
}

public function testCantInsertImageNotFoundNotAllowed(): void
{
$imagePath = 'https://phpspreadsheet.readthedocs.io/en/latest/topics/images/xxx01-03-filter-icon-1.png';
$html = '<table>
<tr>
<td><img src="' . $imagePath . '" alt="test image voilà"></td>
</tr>
</table>';
$filename = HtmlHelper::createHtml($html);
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true, false);
$firstSheet = $spreadsheet->getSheet(0);
$drawingCollection = $firstSheet->getDrawingCollection();
self::assertCount(0, $drawingCollection);
$spreadsheet->disconnectWorksheets();
}

#[DataProvider('providerBadProtocol')]
Expand Down
32 changes: 30 additions & 2 deletions tests/PhpSpreadsheetTests/Reader/Xlsx/URLImageTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -12,14 +12,15 @@

class URLImageTest extends TestCase
{
public function testURLImageSource(): void
public function testURLImageSourceAllowed(): void
{
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
self::markTestSkipped('Skipped due to setting of environment variable');
}
$filename = realpath(__DIR__ . '/../../../data/Reader/XLSX/urlImage.xlsx');
self::assertNotFalse($filename);
$reader = IOFactory::createReader('Xlsx');
$reader->setAllowExternalImages(true);
$spreadsheet = $reader->load($filename);
$worksheet = $spreadsheet->getActiveSheet();
$collection = $worksheet->getDrawingCollection();
Expand All @@ -37,14 +38,41 @@ public function testURLImageSource(): void
$spreadsheet->disconnectWorksheets();
}

public function testURLImageSourceNotFound(): void
public function testURLImageSourceNotAllowed(): void
{
$filename = realpath(__DIR__ . '/../../../data/Reader/XLSX/urlImage.xlsx');
self::assertNotFalse($filename);
$reader = IOFactory::createReader('Xlsx');
$reader->setAllowExternalImages(false);
$spreadsheet = $reader->load($filename);
$worksheet = $spreadsheet->getActiveSheet();
$collection = $worksheet->getDrawingCollection();
self::assertCount(0, $collection);
$spreadsheet->disconnectWorksheets();
}

public function testURLImageSourceNotFoundAllowed(): void
{
if (getenv('SKIP_URL_IMAGE_TEST') === '1') {
self::markTestSkipped('Skipped due to setting of environment variable');
}
$filename = realpath(__DIR__ . '/../../../data/Reader/XLSX/urlImage.notfound.xlsx');
self::assertNotFalse($filename);
$reader = IOFactory::createReader('Xlsx');
$reader->setAllowExternalImages(true);
$spreadsheet = $reader->load($filename);
$worksheet = $spreadsheet->getActiveSheet();
$collection = $worksheet->getDrawingCollection();
self::assertCount(0, $collection);
$spreadsheet->disconnectWorksheets();
}

public function testURLImageSourceNotFoundNotAllowed(): void
{
$filename = realpath(__DIR__ . '/../../../data/Reader/XLSX/urlImage.notfound.xlsx');
self::assertNotFalse($filename);
$reader = IOFactory::createReader('Xlsx');
$reader->setAllowExternalImages(false);
$spreadsheet = $reader->load($filename);
$worksheet = $spreadsheet->getActiveSheet();
$collection = $worksheet->getDrawingCollection();
Expand Down