Integrate quote
quote is the feature branch for converting an array of argument strings
into a properly quoted command line string

Adds control character escaping and Unicode support.
quote() passes through Unicode characters, escapes all others
quote_ascii() escapes all, including Unicode

Author-Rebase-Consent: https://No-rebase.github.io
drok committed Oct 31, 2023
2 parents 7d3b192 + ce839a6 commit 6e9606b
Showing 4 changed files with 361 additions and 42 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,20 @@
# acorn-node change log

All notable changes to this project will be documented in this file.
Starting at release 2.0.0, the project will be strictly Semantically Versioned.

## 2.0.0 2023-10-31
* This is a compatibility breaking change due to behaviour change of quote()
* `quote` fixes picked up from all forks in the network (if I missed any, submit issue)
* backslashes no longer doubled (@drok/Radu Hociung)
* Fix - bash Brace Expansion is correctly quoted (@drok, @iFixit/Daniel Beardsley)
* Major change - glob ops correctly passed through (@emosbaugh/Ethan Mosbaugh)
* Fix - quoting exclamation marks near single-quotes (@raxod502/Radon Rosborough)
* Fix - correctly quote empty string (@cspotcode/Andrew Bradley)
* Major change - Double quotes are no longer used as escape, only single quotes. The output should contain
the minimum necessary quoting, and be eyeball-friendly, as well as shell-grammar compliant. (@drok)
* Major change - control characters are escaped as \E, \t, or \x.., no longer passed through literally (@drok)
* New - `quote_ascii`: works like quote but also escapes Unicode characters as \uHHHH (@drok)
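
The brace-expansion fix listed above can be illustrated with a small standalone check. This is only a sketch: `needsBraceQuoting` is a hypothetical helper, and its pattern is a simplified variant of the one in this commit's `quote.js` (it ignores negative numbers, for instance). It is not part of the package's API.

```js
'use strict';

// Hypothetical helper, for illustration only.
// Matches the bash brace expansions named in the quote.js comments:
//   numeric ranges  {1..20}  {1..20..2}
//   letter ranges   {a..b}   {a..z..3}
//   comma lists     {a,b}
var BRACE_EXPANSION = /\{\d+\.\.\d+(?:\.\.\d+)?\}|\{[a-zA-Z]\.\.[a-zA-Z](?:\.\.\d+)?\}|\{[^{]*,[^}]*\}/;

function needsBraceQuoting(word) {
    return BRACE_EXPANSION.test(word);
}

console.log(needsBraceQuoting('{a..b}'));   // true
console.log(needsBraceQuoting('{1..20}'));  // true
console.log(needsBraceQuoting('{a,b}'));    // true
console.log(needsBraceQuoting('{a...b}'));  // false (three dots is not a range)
console.log(needsBraceQuoting('{..a}'));    // false (no start of range)
```

A word for which this check returns `true` would be expanded by bash into several words if left unquoted, which is why such words are wrapped in single quotes.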

## 1.8.1 2023-10-29
- @Mergesium started a new fork focused on quality. Package will be scoped @mergesium/shell-quote.
52 changes: 46 additions & 6 deletions README.md
@@ -9,22 +9,62 @@

Parse and quote shell commands.

# example
# Example usage

## quote
The `quote` function converts an array of N arbitrary strings into a correctly escaped and quoted string that a POSIX shell would parse as the same N individual words. Unicode characters are preserved for readability, but not every terminal emulator accepts pasted Unicode text. To escape Unicode characters as \uXXXX sequences, use the `quote_ascii` variant.

The output of `quote` is [POSIX shell compliant](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html), and should also be compatible with flavours like bash, ksh, etc. However, Windows CMD is not supported because it interprets the single quote as a literal rather than a quoting character. Commands intended to run on Windows will be quoted incorrectly and will likely need manual adjustment.
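
As a rough illustration of the single-quote rule that the paragraph above relies on: inside single quotes a POSIX shell treats every character literally, so the only character needing special handling is the single quote itself, which is written by closing the quotes, adding `\'`, and reopening them. `posixSingleQuote` below is a hypothetical helper for illustration, not the package's `quote`; unlike `quote`, it always quotes and makes no attempt at minimal output.

```js
'use strict';

// Hypothetical helper, for illustration only: wraps one word in single
// quotes so that a POSIX shell reads it back as the same literal word.
function posixSingleQuote(word) {
    if (word === '') {
        return "''"; // an empty word must still be visible to the shell
    }
    // Close the quotes, emit an escaped quote, reopen the quotes: '\''
    return "'" + word.replace(/'/g, "'\\''") + "'";
}

console.log(posixSingleQuote("it's $HOME")); // 'it'\''s $HOME'
```

Windows CMD would treat those single quotes as ordinary characters, which is why the output is not usable there.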
### Example (open in [Codepen](https://codepen.io/drok-the-scripter/pen/WNPxjaE?editors=0011)):
``` js
var quote = require('@mergesium/shell-quote/quote');
var s = quote([ 'a', 'b c d', '$f', '"g"' ]);
console.log(s);
var { quote } = require('@mergesium/shell-quote/quote');

var user_input = "|-|ello world! | am Bobby⇥les; I have * your files now.";
var cmd_pipe = [
'echo', '-n', user_input,
{op: '|'},
'grep', "-qi", "hello",
{op: '||'},
'echo', "Rude New User"];

console.log(quote(cmd_pipe));
// No, Bobby, you don't have *
```

output
### Output:

```
a 'b c d' \$f '"g"'
echo -n '|-|ello world! | am Bobby⇥les; I have * your files now.' | grep -qi hello || echo 'Rude New User'
```
I.e., if pasted into a terminal window, the shell would `echo` the user input to the stdin of `grep` and output 'Rude New User' if the word 'hello' is not found.

All characters which have a special meaning to the shell (eg, <, >, `, $, #, etc) are quoted appropriately, so the output string can be cut-and-pasted into a terminal window.

The example above demonstrates how input from our old friend [Bobby Tables](https://xkcd.com/327/) is quoted/sanitized so as to thwart Bobby's intended exploit.
## quote_ascii
Same as `quote`, except it also escapes all Unicode characters to \uXXXX sequences. In the example, `⇥` is the only Unicode character, and it is escaped too.

### Example:
``` js
const { quote_ascii } = require('@mergesium/shell-quote/quote');

var user_input = "|-|ello world! | am Bobby⇥les; I have * your files now.";
var cmd_pipe = [
'echo', '-n', user_input,
{op: '|'},
'grep', "-qi", "hello",
{op: '||'},
'echo', "Rude New User"];

console.log(quote_ascii(cmd_pipe));
// No, Bobby, you don't have *
```

### Output:

```
echo -n '|-|ello world! | am Bobby'\\u21e5'les; I have * your files now.' | grep -qi hello || echo 'Rude New User'
```
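
The `\uXXXX` form can be reproduced in a few lines. This is a minimal sketch that assumes the same per-UTF-16-code-unit approach as the `unicodeCharEscape` helper in this commit's `quote.js`. `escapeNonAscii` is a hypothetical name, and unlike `quote_ascii` it does no shell quoting at all.

```js
'use strict';

// Hypothetical helper, for illustration only: replaces every UTF-16 code
// unit above 0x7f with a \uXXXX sequence (four lowercase hex digits).
function escapeNonAscii(text) {
    return text.replace(/[^\x00-\x7f]/g, function (ch) {
        return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
    });
}

console.log(escapeNonAscii('Bobby\u21e5les')); // Bobby\u21e5les
```

Because the pattern has no `u` flag, a character outside the Basic Multilingual Plane is escaped as two `\uXXXX` sequences, one per surrogate.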
## parse

``` js
142 changes: 131 additions & 11 deletions quote.js
@@ -1,16 +1,136 @@
'use strict';

module.exports = function quote(xs) {
return xs.map(function (s) {
if (s && typeof s === 'object') {
return s.op.replace(/(.)/g, '\\$1');
}
if ((/["\s]/).test(s) && !(/'/).test(s)) {
return "'" + s.replace(/(['\\])/g, '\\$1') + "'";
// quote - Quotes an array of strings into a shell command line string.
// Escapes all characters that have special meaning to a shell.
// Escapes all control characters (0x01-0x1f) to \x?? sequences
// Drops null characters (0x00), as shells do.
// Preserves the meaning of operator objects like {op:'glob'} or
// {op: '|'}
// Preserves literal Unicode characters
// Useful when its output will be read in a Unicode capable environment,
// like an editor, an email, a web page or a UTF-8 capable terminal
// emulator
function quote(xs) {
return xs.map(function (s) {
if (s && typeof s === 'object') {
if (s.op === 'glob') {
return s.pattern;
}
return s.op;
}
else {
s = String(s).replace(/\x00/g, ''); // eat \x00 just like the shell.
if(s === '') {
return "''";
}
// Enclose strings with metacharacters in single quotes,
// and escape any single quotes.
// Match strictly to avoid escaping things that don't need to be.
// bash: | & ; ( ) < > space tab
// Also escapes bash curly brace ranges {a..b} {a..z..3} {1..20} {a,b} but not
// {a...b} or {..a}
else if ((/(?:["\\$`! |&;\(\)<>#]|{[\d]+\.{2}[\d]+(?:\.\.\d+)?}|{[a-zA-Z]\.{2}[a-zA-Z](?:\.\.\d+)?}|{[^{]*,[^}]*})/m).test(s)) {
// If input contains outer single quote, escape each of them individually.
// eg. 'a b c' -> \''a b c'\'
var outer_quotes = s.match(/^('*)(.*?)('*)$/s);

var inner_string = outer_quotes[2].replace(/'/g, '\'\\\'\'');

// the starting outer quotes individually escaped
return String(outer_quotes[1]).replace(/(.)/g, '\\$1') +
// the text inside the outer single quotes is single quoted
"'" + unprintableEscape(inner_string, "'", false).replace(/(?<!\\)''/g, '') + "'" +
// "'" + inner_string + "'" +
// the ending outer quotes individually escaped
String(outer_quotes[3]).replace(/(.)/g, '\\$1');
}
// Only escape the single quotes in strings without metachars or
// separators
return unprintableEscape(String(s).replace(/(')/g, '\\$1'), '', false);
}
if ((/["'\s]/).test(s)) {
return '"' + s.replace(/(["\\$`!])/g, '\\$1') + '"';
}).join(' ');
}

// quote_ascii - Quotes an array of strings to a strict ASCII representation.
// Does the same as quote(), except:
// Escapes Unicode characters to \u???? sequences
// Suitable when the output will be read or pasted into an environment which is not
// UTF-8 capable (eg, some terminal emulators)
function quote_ascii(xs) {
return xs.map(function (s) {
if (s && typeof s === 'object') {
if (s.op === 'glob') {
return s.pattern;
}
return s.op;
}
else {
s = String(s).replace(/\x00/g, ''); // eat \x00 just like the shell.
if(s === '') {
return "''";
}
// Enclose strings with metacharacters in single quotes,
// and escape any single quotes.
// Match strictly to avoid escaping things that don't need to be.
// bash: | & ; ( ) < > space tab
// Also escapes bash curly brace ranges {a..b} {a..z..3} {1..20} {a,b} but not
// {a...b} or {..a}
else if ((/(?:["\\$`! |&;\(\)<>#]|{[\d]+\.{2}[\d]+(?:\.\.\d+)?}|{[a-zA-Z]\.{2}[a-zA-Z](?:\.\.\d+)?}|{[^{]*,[^}]*})/m).test(s)) {
// If input contains outer single quote, escape each of them individually.
// eg. 'a b c' -> \''a b c'\'
var outer_quotes = s.match(/^('*)(.*?)('*)$/s);

var inner_string = outer_quotes[2].replace(/'/g, '\'\\\'\'');

// the starting outer quotes individually escaped
return String(outer_quotes[1]).replace(/(.)/g, '\\$1') +
// the text inside the outer single quotes is single quoted
"'" + unprintableEscape(inner_string, "'", true).replace(/(?<!\\)''/g, '') + "'" +
// "'" + inner_string + "'" +
// the ending outer quotes individually escaped
String(outer_quotes[3]).replace(/(.)/g, '\\$1');
}
// Only escape the single quotes in strings without metachars or
// separators
return unprintableEscape(String(s).replace(/(')/g, '\\$1'), '', true);
}
return String(s).replace(/([A-Za-z]:)?([#!"$&'()*,:;<=>?@[\\\]^`{|}])/g, '$1\\$2');
}).join(' ');
}).join(' ');
}

function padWithLeadingZeros(string) {
return new Array(5 - string.length).join("0") + string;
}

function unicodeCharEscape(charCode) {
return "\\u" + padWithLeadingZeros(charCode.toString(16));
}

function escapeCtrlChars(charCode) {
switch (charCode) {
case 8: return '\\b';
case 9: return '\\t';
case 10: return '\\n';
case 11: return '\\v';
case 12: return '\\f';
case 13: return '\\r';
case 27: return '\\E';
default:
if (charCode < 16) return '\\x0' + charCode.toString(16);
else return '\\x' + charCode.toString(16);
}
}

function unprintableEscape(string, quote, ascii) {
return string.split("")
.map(function (char) {
var charCode = char.charCodeAt(0);
return charCode < 32 ? quote + escapeCtrlChars(charCode) + quote :
(ascii && charCode > 127) ? quote + unicodeCharEscape(charCode) + quote : char;
})
.join("");
}

module.exports = {
quote_ascii: quote_ascii,
quote: quote
};
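
To see what the control-character branch of `unprintableEscape` produces, the sketch below copies `escapeCtrlChars` from the diff above and applies it through a simplified, hypothetical wrapper named `escapeControls`. The wrapper is an assumption made for illustration: it leaves out the quote splicing and the Unicode handling of the real function.

```js
'use strict';

// Copied from quote.js in this commit: maps a control character code to a
// named escape where one exists, otherwise to a two-digit \x.. sequence.
function escapeCtrlChars(charCode) {
    switch (charCode) {
    case 8: return '\\b';
    case 9: return '\\t';
    case 10: return '\\n';
    case 11: return '\\v';
    case 12: return '\\f';
    case 13: return '\\r';
    case 27: return '\\E';
    default:
        if (charCode < 16) return '\\x0' + charCode.toString(16);
        else return '\\x' + charCode.toString(16);
    }
}

// Hypothetical wrapper, for illustration only: escapes characters below
// 0x20 and passes everything else through unchanged.
function escapeControls(text) {
    return text.split('').map(function (ch) {
        var code = ch.charCodeAt(0);
        return code < 32 ? escapeCtrlChars(code) : ch;
    }).join('');
}

console.log(escapeControls('a\tb\x1b[0m\x01')); // a\tb\E[0m\x01
```

Tab and ESC use the named forms `\t` and `\E`, while 0x01 has no named form and falls back to `\x01`.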