
CSV reader – backslash is always treated as an escape character #492

@rdarcy1


This is:

- [x] a bug report
- [ ] a feature request

What is the expected behavior?

The reader should offer a way to treat the backslash as an ordinary character instead of always parsing it as an escape character.

What is the current behavior?

The backslash always acts as an escape character. When it is the last character of a quoted cell/field, it escapes the closing quote, so the field runs on into the following cells/lines.

What are the steps to reproduce?

<?php

require __DIR__ . '/vendor/autoload.php';

// CSV file with backslash at end of field
$contents = '"test field\"' . "\n" . '"another field"';
$filename = __DIR__ . '/test.csv';
file_put_contents($filename, $contents);


// Create a new reader
$reader = new PhpOffice\PhpSpreadsheet\Reader\Csv;

$spreadsheet = $reader->load($filename);

var_dump($spreadsheet->getActiveSheet()->getCell('A1')->getValue());

Output:

string(27) "test field\"
another field""
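The same merge can be shown without PhpSpreadsheet, assuming the reader relies on PHP's native CSV parsing. This is a minimal sketch using `str_getcsv()`. The sample line is made up for illustration, and passing an empty string as the escape argument requires PHP 7.4 or later.

```php
<?php

// One CSV line whose first quoted field ends with a backslash.
$line = '"test field\\","next"';

// Backslash as the escape character: \" no longer closes the field.
$escaped = str_getcsv($line, ',', '"', '\\');

// Empty escape string (PHP 7.4+): the backslash is ordinary data.
$literal = str_getcsv($line, ',', '"', '');

var_dump(count($escaped)); // int(1): both cells were merged into one
var_dump($literal);        // two fields: 'test field\' and 'next'
```

With the escape mechanism disabled, the closing quote ends the field as expected and the trailing backslash is kept as part of the value.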

I know the backslash is commonly used as an escape character, but it would be good to at least have an option to turn this off, perhaps with a method call such as $reader->setEscaping(false). I'm currently dealing with third-party CSVs that contain backslashes (a poor design choice, but since there is no universal standard I think it's perfectly valid).

RFC 4180 makes no mention of backslashes; instead, a quote character inside a quoted field is escaped by doubling it.
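As a small illustration of the RFC 4180 convention, the sketch below parses a line in which quotes are escaped by doubling and backslashes are plain data. It uses PHP's `str_getcsv()` with an empty escape string (PHP 7.4+), and the sample values are invented for the example.

```php
<?php

// RFC 4180 style: a quote inside a quoted field is written as "".
// The second field contains backslashes, including one at the very end.
$line = '"say ""hi""","C:\\dir\\"';

// An empty escape string turns off PHP's backslash handling,
// leaving only the doubled-quote rule in effect.
$fields = str_getcsv($line, ',', '"', '');

var_dump($fields); // two fields: 'say "hi"' and 'C:\dir\'
```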

Happy to put together a pull request if this change is deemed worthwhile.
