This repository has been archived by the owner on Jun 1, 2023. It is now read-only.

Abort on Malformed UTF-8 character errors
utf8n_to_uvchr_error() only warns on some Malformed UTF-8 characters,
but scan_const needs to error here. Do it with yyerror() which
accumulates all parser errors until it "has too many errors".
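The accumulate-then-give-up behaviour described above can be sketched in isolation. This is a minimal illustration, not Perl's actual yyerror(): the names `parse_state`, `report_error` and `MAX_PARSE_ERRORS`, and the cap value itself, are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical cap; the sketch only needs some limit to exist. */
#define MAX_PARSE_ERRORS 10

typedef struct {
    int  error_count;  /* errors recorded so far */
    bool aborted;      /* set once the cap has been reached */
} parse_state;

/* Record one error and let parsing continue.
 * Returns false once the cap is reached, telling the caller to stop. */
static bool report_error(parse_state *ps, const char *msg)
{
    if (ps->aborted)
        return false;               /* already gave up; record nothing more */

    fprintf(stderr, "error %d: %s\n", ps->error_count + 1, msg);

    if (++ps->error_count >= MAX_PARSE_ERRORS) {
        fprintf(stderr, "too many errors, giving up\n");
        ps->aborted = true;
        return false;
    }
    return true;
}
```

A caller would invoke `report_error(&ps, "Malformed UTF-8 character")` at the point of detection and keep scanning while it returns true, so that several problems in one source file are reported together rather than one per run.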

Fixes 2 errors in #293, esp.
id:000162,sig:06,src:026278+031045,op:splice,rep:32 and
id:000001,sig:06,src:024259,op:arith8,pos:5,val:+27
which segfaulted in the error handler for
"panic: constant overflowed allocated space"
rurban committed Jul 28, 2017
1 parent 3d0cb2c commit 3d4bad5
Showing 3 changed files with 15 additions and 9 deletions.
6 changes: 6 additions & 0 deletions pod/perldiag.pod
@@ -3532,6 +3532,12 @@ Perhaps the function's author was trying to write a subroutine signature
 but didn't enable that feature first (C<use feature 'signatures'>),
 so the signature was instead interpreted as a bad prototype.
 
+=item Malformed UTF-8 character
+
+(F) Perl detected one or more fatal UTF-8 errors while parsing a
+constant UTF-8 string, which are detailed in the first warnings utf8
+message. See L</"Malformed UTF-8 character%s">
+
 =item Malformed UTF-8 character%s
 
 (S utf8)(F) Perl detected a string that should be UTF-8, but didn't
6 changes: 3 additions & 3 deletions toke.c
@@ -4123,9 +4123,9 @@ S_scan_const(pTHX_ char *start)
         else if (this_utf8 && has_utf8) {   /* Both UTF-8, can just copy */
             const STRLEN len = UTF8SKIP(s);
 
-            /* We expect the source to have already been checked for
-             * malformedness */
-            assert(isUTF8_CHAR((U8 *) s, (U8 *) send));
+            /* utf8n_to_uvchr_error might have only warned: promote to error */
+            if (!isUTF8_CHAR((U8 *) s, (U8 *) send))
+                yyerror("Malformed UTF-8 character");
 
             Copy(s, d, len, U8);
             d += len;
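The hunk above replaces an assert(), which standard C compiles out when NDEBUG is defined, with a check that always runs. A standalone sketch of the same check-before-copy idea follows. It is an assumption-laden illustration rather than the toke.c code: `utf8_expected_len`, `utf8_char_len` and `copy_utf8_char` are hypothetical helpers, the validation is structural only (overlong forms and surrogates are not rejected), it covers standard UTF-8 rather than Perl's extended encoding, and unlike the patched scan_const it skips the copy when the check fails.

```c
#include <stddef.h>
#include <string.h>

/* Length implied by a UTF-8 lead byte, or 0 if the byte cannot start a
 * character (continuation byte, 0xC0/0xC1, or above 0xF4). */
static size_t utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80)                  return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

/* Returns the character length if [s, end) starts with a structurally
 * complete character, else 0.  Never reads at or beyond 'end'. */
static size_t utf8_char_len(const unsigned char *s, const unsigned char *end)
{
    size_t len, i;

    if (s >= end)
        return 0;
    len = utf8_expected_len(*s);
    if (len == 0 || (size_t)(end - s) < len)
        return 0;                     /* bad lead byte or truncated input */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;                 /* expected a continuation byte */
    return len;
}

/* Copy one character from *src to *dst, advancing both pointers.
 * Returns 0 on success, -1 if the input is malformed or the output
 * buffer has no room; nothing is copied in the failure case. */
static int copy_utf8_char(const unsigned char **src, const unsigned char *src_end,
                          unsigned char **dst, const unsigned char *dst_end)
{
    size_t len = utf8_char_len(*src, src_end);

    if (len == 0 || (size_t)(dst_end - *dst) < len)
        return -1;
    memcpy(*dst, *src, len);
    *src += len;
    *dst += len;
    return 0;
}
```

Taking the length from the lead byte alone and copying that many bytes would read past the end of a truncated sequence such as `E2 82`; validating against the end pointer first is what keeps both the read and the write in bounds.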
12 changes: 6 additions & 6 deletions utf8.c
@@ -810,14 +810,14 @@ Perl__byte_dump_string(pTHX_ const U8 * s, const STRLEN len, const bool format)
 PERL_STATIC_INLINE char *
 S_unexpected_non_continuation_text(pTHX_ const U8 * const s,
 
-        /* How many bytes to print */
-        STRLEN print_len,
+        /* How many bytes to print */
+        STRLEN print_len,
 
-        /* Which one is the non-continuation */
-        const STRLEN non_cont_byte_pos,
+        /* Which one is the non-continuation */
+        const STRLEN non_cont_byte_pos,
 
-        /* How many bytes should there be? */
-        const STRLEN expect_len)
+        /* How many bytes should there be? */
+        const STRLEN expect_len)
 {
     /* Return the malformation warning text for an unexpected continuation
      * byte. */