@@ -325,16 +325,11 @@ Z = erlang:adler32_combine(X,Y,iolist_size(Data2)).</code>
  is <c>latin1</c>, one byte exists for each character
  in the text representation. If <c><anno>Encoding</anno></c> is
  <c>utf8</c> or
- <c>unicode</c>, the characters are encoded using UTF-8
- (that is, characters from 128 through 255 are
- encoded in two bytes).</p>
+ <c>unicode</c>, the characters are encoded using UTF-8 where
+ characters may require multiple bytes.</p>
  <note>
- <p><c>atom_to_binary(<anno>Atom</anno>, latin1)</c> never
- fails, as the text representation of an atom can only
- contain characters from 0 through 255. In a future release,
- the text representation
- of atoms can be allowed to contain any Unicode character and
- <c>atom_to_binary(<anno>Atom</anno>, latin1)</c> then fails if the
+ <p>As from Erlang/OTP 20, atoms can contain any Unicode character
+ and <c>atom_to_binary(<anno>Atom</anno>, latin1)</c> may fail if the
  text representation for <c><anno>Atom</anno></c> contains a Unicode
  character > 255.</p>
  </note>
@@ -402,13 +397,11 @@ Z = erlang:adler32_combine(X,Y,iolist_size(Data2)).</code>
  translation of bytes in the binary is done.
  If <c><anno>Encoding</anno></c>
  is <c>utf8</c> or <c>unicode</c>, the binary must contain
- valid UTF-8 sequences. Only Unicode characters up
- to 255 are allowed.</p>
+ valid UTF-8 sequences.</p>
  <note>
- <p><c>binary_to_atom(<anno>Binary</anno>, utf8)</c> fails if
- the binary contains Unicode characters > 255.
- In a future release, such Unicode characters can be allowed and
- <c>binary_to_atom(<anno>Binary</anno>, utf8)</c> does then not fail.
+ <p>As from Erlang/OTP 20, <c>binary_to_atom(<anno>Binary</anno>, utf8)</c>
+ is capable of encoding any Unicode character. Earlier versions would
+ fail if the binary contained Unicode characters > 255.
  For more information about Unicode support in atoms, see the
  <seealso marker="erl_ext_dist#utf8_atoms">note on UTF-8
  encoded atoms</seealso>
@@ -419,9 +412,7 @@ Z = erlang:adler32_combine(X,Y,iolist_size(Data2)).</code>
  > <input>binary_to_atom(<<"Erlang">>, latin1).</input>
  'Erlang'
  > <input>binary_to_atom(<<1024/utf8>>, utf8).</input>
- ** exception error: bad argument
- in function binary_to_atom/2
- called as binary_to_atom(<<208,128>>,utf8)</pre>
+ 'Ѐ'</pre>
  </desc>
  </func>

@@ -2401,10 +2392,10 @@ os_prompt%</pre>
  <desc>
  <p>Returns the atom whose text representation is
  <c><anno>String</anno></c>.</p>
- <p><c><anno>String</anno></c> can only contain ISO-latin-1
- characters (that is, numbers < 256) as the implementation does not
- allow Unicode characters equal to or above 256 in atoms.
- For more information on Unicode support in atoms, see
+ <p>As from Erlang/OTP 20, <c><anno>String</anno></c> may contain
+ any Unicode character. Earlier versions allowed only ISO-latin-1
+ characters as the implementation did not allow Unicode characters
+ above 255. For more information on Unicode support in atoms, see
  <seealso marker="erl_ext_dist#utf8_atoms">note on UTF-8
  encoded atoms</seealso>
  in section "External Term Format" in the User's Guide.</p>
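
The behaviour described by the updated text can be checked with a small module. This is a minimal sketch, assuming Erlang/OTP 20 or later; the module name `atom_unicode_sketch` and the function `demo/0` are hypothetical and are not part of this commit.

```erlang
-module(atom_unicode_sketch).
-export([demo/0]).

%% Hypothetical helper, not part of the commit. Assumes Erlang/OTP 20+.
demo() ->
    %% Code point 233 fits in latin1: one byte there, two bytes in UTF-8.
    Small = list_to_atom([233]),
    <<233>> = atom_to_binary(Small, latin1),
    <<195,169>> = atom_to_binary(Small, utf8),

    %% Code point 1024 is above 255; binary_to_atom/2 accepts it on OTP 20+.
    Big = binary_to_atom(<<1024/utf8>>, utf8),
    %% The UTF-8 text representation of this atom is two bytes.
    <<208,128>> = atom_to_binary(Big, utf8),
    %% list_to_atom/1 yields the same atom from the same code point.
    Big = list_to_atom([1024]),

    %% latin1 cannot represent code point 1024, so the conversion fails.
    Latin1Result = try atom_to_binary(Big, latin1)
                   catch error:badarg -> badarg
                   end,
    {Big, Latin1Result}.
```

On releases before Erlang/OTP 20, the `binary_to_atom/2` and `list_to_atom/1` calls with code point 1024 would be expected to fail instead, as shown by the removed lines of the example hunk.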