If a quoted escape sequence hits the end of the string, the last character is duplicated #15

@movermeyer-stripe

Description

The parser duplicates the last character of a pattern when a single quote is used to escape a single syntax character, and there are no more single quotes in the rest of the pattern.

This results in incorrect output where the final character appears twice.

Steps to Reproduce

require 'message_format'

pattern = "Hello '{literal}!"
message = MessageFormat.new(pattern, 'en-US').format()
puts message

Expected Behaviour

The pattern Hello '{literal}! should output: Hello {literal}!

The '{ sequence escapes the opening curly brace and starts a quoted section, so {literal}! is treated as literal text until the pattern ends.

Actual Behaviour

The pattern produces: Hello {literal}!!

Notice the exclamation mark is duplicated.

Root Cause

The bug exists in lib/message_format/parser.rb in the parse_text method (lines 89-109). When processing a quoted section:

  1. A single quote followed by a special character (e.g., '{) enters the quoted section handler
  2. The inner while loop (while @index + 1 < @length) processes characters looking for a closing quote
  3. When no closing quote is found, the loop exits with @index pointing to the last processed character
  4. Control returns to the outer while loop, which then reprocesses the same character, causing duplication
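
The four steps above can be reproduced with a small standalone sketch. This is not the code from lib/message_format/parser.rb; it is a simplified model that assumes only the loop structure described in this issue. The method name parse_text_sketch, the SPECIAL constant, and the fix: keyword are hypothetical names introduced for illustration, and the sketch ignores doubled quotes and the # character in plural contexts.

```ruby
# Hypothetical, simplified model of the loop structure described above.
# NOT the actual parse_text from lib/message_format/parser.rb.
SPECIAL = ['{', '}'].freeze

def parse_text_sketch(pattern, fix: false)
  index = 0
  length = pattern.length
  text = +''

  while index < length
    char = pattern[index]
    if char == "'" && SPECIAL.include?(pattern[index + 1])
      # Step 1: a quote followed by a special character starts a quoted section.
      index += 1
      text << pattern[index]
      closed = false

      # Step 2: copy characters while looking for a closing quote.
      while index + 1 < length
        index += 1
        char = pattern[index]
        if char == "'"
          index += 1 # step past the closing quote
          closed = true
          break
        end
        text << char
      end

      # Step 3: with no closing quote, index still points at the last
      # character that was already copied. One possible fix is to step past it.
      index += 1 if fix && !closed
    else
      # Step 4: without the fix, the outer loop copies that character again.
      text << char
      index += 1
    end
  end

  text
end

puts parse_text_sketch("Hello '{literal}!")            # duplicated "!"
puts parse_text_sketch("Hello '{literal}!", fix: true) # single "!"
```

In this model, the unfixed call returns Hello {literal}!! and the call with fix: true returns Hello {literal}!, which matches the actual and expected behaviour reported above. Whether the real parser should advance the index or restructure the loop is a decision for the maintainers; the sketch only shows where the extra character comes from.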

Test Case

A test case demonstrating the expected behaviour:

it 'handles escaped curly brace' do
  pattern = 'Hello \'{literal}!'
  message = MessageFormat.new(pattern, 'en-US').format()

  expect(message).to eql('Hello {literal}!')
end

Environment

  • Ruby version:
    $ ruby --version
    ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin23]
    
  • message-format-rb version: 0.0.8
  • OS:
    $ uname -a
    Darwin 25.1.0 Darwin Kernel Version 25.1.0: Mon Oct 20 19:33:36 PDT 2025; root:xnu-12377.41.6~2/RELEASE_ARM64_T6030 arm64
    

Additional Context

This bug affects any pattern where:

  • A single quote precedes a special character ({, }, or # in plural contexts)
  • The pattern contains no additional single quotes

I'm fairly certain that this is valid ICU MessageFormat syntax. Indeed, it parses as expected on format-message.github.io.
But even if it weren't valid, the correct behaviour would be to raise an error, not to duplicate the last character.
