Merged
131 changes: 112 additions & 19 deletions encoding.bs
@@ -1294,16 +1294,20 @@ attribute's getter, when invoked, must return "<code>utf-8</code>".
<h3 id=interface-textencoder>Interface {{TextEncoder}}</h3>

<pre class=idl>
dictionary TextEncoderEncodeIntoResult {
  unsigned long long read;
  unsigned long long written;
};

[Constructor,
 Exposed=(Window,Worker)]
interface TextEncoder {
  [NewObject] Uint8Array encode(optional USVString input = "");
  TextEncoderEncodeIntoResult encodeInto(USVString source, Uint8Array destination);
};
TextEncoder includes TextEncoderCommon;
</pre>

<p>A {{TextEncoder}} object has an associated <dfn for=TextEncoder>encoder</dfn>.

<p class="note no-backref">A {{TextEncoder}} object offers no <var>label</var> argument as it only
supports <a>UTF-8</a>. It also offers no <code>stream</code> option as no <a for=/>encoder</a>
requires buffering of scalar values.
@@ -1319,18 +1323,17 @@ requires buffering of scalar values.

<dt><code><var>encoder</var> . <a method for=TextEncoder lt=encode()>encode([<var>input</var> = ""])</a></code>
<dd><p>Returns the result of running <a>UTF-8</a>'s <a for=/>encoder</a>.

<dt><code><var>encoder</var> . <a method for=TextEncoder lt="encodeInto(source, destination)">encodeInto(<var>source</var>, <var>destination</var>)</a></code>
<dd><p>Runs the <a>UTF-8 encoder</a> on <var>source</var>, stores the result of that operation into
<var>destination</var>, and returns the progress made as a dictionary whereby
{{TextEncoderEncodeIntoResult/read}} is the number of converted <a>code units</a> of
<var>source</var> and {{TextEncoderEncodeIntoResult/written}} is the number of bytes modified in
<var>destination</var>.
</dl>
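
As an illustration of the `read` and `written` members described above, here is a minimal sketch. It assumes an environment that already ships `TextEncoder.prototype.encodeInto()`. The variable names are illustrative only and are not part of this change.

```javascript
const encoder = new TextEncoder();

// "a" needs 1 byte and "€" (U+20AC) needs 3 bytes, so both fit in 4 bytes.
// The trailing "b" does not fit and is left unconverted.
const destination = new Uint8Array(4);
const first = encoder.encodeInto("a€b", destination);
// first.read === 2 (code units), first.written === 4 (bytes)

// U+1F600 is one scalar value but two UTF-16 code units (a surrogate pair).
const emoji = encoder.encodeInto("\u{1F600}", new Uint8Array(4));
// emoji.read === 2, emoji.written === 4

// A scalar value is never split: with only 3 bytes free, nothing is written.
const tooSmall = encoder.encodeInto("\u{1F600}", new Uint8Array(3));
// tooSmall.read === 0, tooSmall.written === 0
```

A caller can resume from `source.substring(read)` once more space is available in the destination.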

<p>The <dfn constructor for=TextEncoder id=dom-textencoder><code>TextEncoder()</code></dfn>
constructor, when invoked, must run these steps:

<ol>
<li><p>Let <var>enc</var> be a new {{TextEncoder}} object.

<li><p>Set <var>enc</var>'s <a for=TextEncoder>encoder</a> to <a>UTF-8</a>'s <a for=/>encoder</a>.

<li><p>Return <var>enc</var>.
</ol>
constructor, when invoked, must return a new {{TextEncoder}} object.

<p>The <dfn method for=TextEncoder><code>encode(<var>input</var>)</code></dfn> method, when invoked,
must run these steps:
@@ -1347,20 +1350,108 @@ must run these steps:
<li><p>Let <var>token</var> be the result of
<a>reading</a> from <var>input</var>.

<li><p>Let <var>result</var> be the result of
<a>processing</a> <var>token</var> for
<a for=TextEncoder>encoder</a>, <var>input</var>, <var>output</var>.
<li><p>Let <var>result</var> be the result of <a>processing</a> <var>token</var> for the
<a>UTF-8 encoder</a>, <var>input</var>, <var>output</var>.

<li>
<p>Assert: <var>result</var> is not <a>error</a>.

<p class=note>The <a>UTF-8 encoder</a> cannot return <a>error</a>.

<li><p>If <var>result</var> is <a>finished</a>, convert <var>output</var> into a byte sequence,
and then return a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing <var>output</var>.
<!-- XXX https://www.w3.org/Bugs/Public/show_bug.cgi?id=26966 -->
</ol>
</ol>

<p>The
<dfn method for=TextEncoder><code>encodeInto(<var>source</var>, <var>destination</var>)</code></dfn>
method, when invoked, must run these steps:

<ol>
<li><p>Let <var>read</var> be 0.

<li><p>Let <var>written</var> be 0.

<li><p>Let <var>destinationBytes</var> be the result of
<a lt="get a reference to the buffer source">getting a reference to the bytes held by</a>
<var>destination</var>.

<li>
<p>Let <var>unused</var> be a new <a for=/>stream</a>.

<p class=note>The <a>handler</a> algorithm invoked below requires this argument, but it is not
used by the <a>UTF-8 encoder</a>.

<li><p>Convert <var>source</var> to a <a for=/>stream</a>.

<li>
<p>While true:

<ol>
<li><p>Let <var>token</var> be the result of <a>reading</a> from <var>source</var>.

<li><p>Let <var>result</var> be the result of running the <a>UTF-8 encoder</a>'s <a>handler</a>
on <var>unused</var> and <var>token</var>.

<li><p>If <var>result</var> is <a>finished</a>, then <a for=iteration>break</a>.

<li>
<p>If <var>result</var> is <a>finished</a>, convert <var>output</var> into a
byte sequence, and then return a {{Uint8Array}} object wrapping an
{{ArrayBuffer}} containing <var>output</var>.
<!-- XXX https://www.w3.org/Bugs/Public/show_bug.cgi?id=26966 -->
<p>Otherwise:

<p class=note><a>UTF-8</a> cannot return <a>error</a>.
<ol>
<li>
<p>If <var>destinationBytes</var>'s <a for="byte sequence">length</a> &minus;
<var>written</var> is greater than or equal to the number of bytes in <var>result</var>, then:

<ol>
<li><p>If <var>token</var> is greater than U+FFFF, then increment <var>read</var> by 2.

<li><p>Otherwise, increment <var>read</var> by 1.

<li><p>Write the bytes in <var>result</var> into <var>destinationBytes</var>, from byte
offset <var>written</var>.

<li><p>Increment <var>written</var> by the number of bytes in <var>result</var>.
</ol>

<li><p>Otherwise, <a for=iteration>break</a>.
</ol>
</ol>

<li><p>Return a new {{TextEncoderEncodeIntoResult}} dictionary whose
{{TextEncoderEncodeIntoResult/read}} member is <var>read</var> and
{{TextEncoderEncodeIntoResult/written}} member is <var>written</var>.
</ol>
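
The steps above can be approximated in script as follows. This is a non-normative sketch, not how any implementation is required to work. `utf8Bytes()` and `encodeIntoSketch()` are hypothetical names introduced here. The replacement of lone surrogates with U+FFFD stands in for the `USVString` conversion that happens before the algorithm runs.

```javascript
// Hypothetical helper: the UTF-8 bytes for a single scalar value.
function utf8Bytes(codePoint) {
  if (codePoint <= 0x7F) return [codePoint];
  if (codePoint <= 0x7FF) {
    return [0xC0 | (codePoint >> 6), 0x80 | (codePoint & 0x3F)];
  }
  if (codePoint <= 0xFFFF) {
    return [0xE0 | (codePoint >> 12),
            0x80 | ((codePoint >> 6) & 0x3F),
            0x80 | (codePoint & 0x3F)];
  }
  return [0xF0 | (codePoint >> 18),
          0x80 | ((codePoint >> 12) & 0x3F),
          0x80 | ((codePoint >> 6) & 0x3F),
          0x80 | (codePoint & 0x3F)];
}

// Hypothetical sketch of the encodeInto() steps.
function encodeIntoSketch(source, destination) {
  let read = 0;
  let written = 0;
  // String iteration yields one code point at a time; a lone surrogate
  // is yielded on its own as a single code unit.
  for (const ch of source) {
    let codePoint = ch.codePointAt(0);
    // Assumption: emulate the USVString conversion of lone surrogates.
    if (codePoint >= 0xD800 && codePoint <= 0xDFFF) codePoint = 0xFFFD;
    const bytes = utf8Bytes(codePoint);
    // Stop as soon as the next scalar value does not fit in full.
    if (destination.length - written < bytes.length) break;
    read += codePoint > 0xFFFF ? 2 : 1;
    destination.set(bytes, written);
    written += bytes.length;
  }
  return { read, written };
}
```

The space check runs before anything is written, so the destination never ends up holding a partial byte sequence for a scalar value.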

<div class=example id=example-textencoder-encodeinto>
<p>The <a method for=TextEncoder lt="encodeInto(source, destination)">encodeInto()</a> method can
be used to encode a string into an existing {{ArrayBuffer}} object. Various details below are left
as an exercise for the reader, but this demonstrates an approach one could take to use this method:

<pre><code class=lang-javascript>
function convertString(buffer, input, callback) {
  let bufferSize = 256,
      bufferStart = malloc(buffer, bufferSize),
      writeOffset = 0,
      readOffset = 0;
  while (true) {
    const view = new Uint8Array(buffer, bufferStart + writeOffset, bufferSize - writeOffset),
          {read, written} = cachedEncoder.encodeInto(input.substring(readOffset), view);
    readOffset += read;
    writeOffset += written;
    if (readOffset === input.length) {
      callback(bufferStart, writeOffset);
      free(buffer, bufferStart);
      return;
    }
    bufferSize *= 2;
    bufferStart = realloc(buffer, bufferStart, bufferSize);
  }
}
</code></pre>
</div>
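
For readers who want something runnable without the `malloc()`, `realloc()`, and `free()` placeholders, the sketch below uses the same loop but grows a plain `Uint8Array` instead. `encodeWithGrowth()` and its `initialSize` parameter are hypothetical names introduced here, and the doubling strategy is an assumption carried over from the example rather than a recommendation.

```javascript
// Hypothetical variant of the example loop, backed by a growable Uint8Array.
function encodeWithGrowth(input, initialSize = 8) {
  const encoder = new TextEncoder();
  // Guard against a zero-sized buffer, which doubling could never grow.
  let bytes = new Uint8Array(Math.max(1, initialSize));
  let readOffset = 0;
  let writeOffset = 0;
  while (true) {
    // Encode the unread remainder into the unused tail of the buffer.
    const { read, written } = encoder.encodeInto(
      input.substring(readOffset),
      bytes.subarray(writeOffset)
    );
    readOffset += read;
    writeOffset += written;
    if (readOffset === input.length) {
      // Everything was converted; expose only the bytes actually written.
      return bytes.subarray(0, writeOffset);
    }
    // Out of space: double the buffer and keep what was already written.
    const grown = new Uint8Array(bytes.length * 2);
    grown.set(bytes.subarray(0, writeOffset));
    bytes = grown;
  }
}
```

Because `read` always lands on a scalar value boundary, `input.substring(readOffset)` never starts in the middle of a surrogate pair.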


<h3 id=interface-mixin-generictransformstream>Interface mixin {{GenericTransformStream}}</h3>

@@ -3205,6 +3296,8 @@ Ken Whistler,
Kenneth Russell,
田村健人 (Kent Tamura),
Leif Halvard Silli,
Luke Wagner,
Maciej Hirsz,
Makoto Kato,
Mark Callow,
Mark Crispin,