Magic Command Prompt Interpolation Proposal #1260

Open
wants to merge 4 commits into
base: main

francoculaciati

Issue #27

Disclaimer: This update introduces breaking changes to the previous templating format. Existing notebooks and code using the old {variable} syntax will need to be updated. Consider implementing a backwards compatibility mode or making the delimiter configurable if legacy support is required.

Prompt interpolation is broken when using the %%ai magic command due to the direct use of Python's .format_map() method on user inputs. This leads to errors when variables contain special characters, such as curly braces {}, which interfere with string formatting.

Examples:

Case 1: Direct String Formatting Fails

Input:

%%ai  
explain this code  

print({'foo': 'bar'})

Error:

ValueError: Invalid format specifier ' 'bar'' for object of type 'str'

The error occurs because .format_map() treats the braces of the dictionary literal as a replacement field, so the text after the colon (' 'bar'') is parsed as a format specifier.
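
A minimal reproduction outside Jupyter may help here. It assumes the prompt is formatted with a mapping that tolerates unknown names; the `LenientNamespace` class is a hypothetical stand-in, not the project's code. With a plain dict the same prompt fails one step earlier, with a KeyError.

```python
class LenientNamespace(dict):
    """Hypothetical stand-in for a namespace that tolerates unknown names."""

    def __missing__(self, key):
        # Hand back the placeholder text unchanged for undefined names.
        return "{" + key + "}"


prompt = "explain this code\n\nprint({'foo': 'bar'})"

try:
    prompt.format_map(LenientNamespace())
    error_type = None
except ValueError as exc:
    # "'foo'" is parsed as the field name and " 'bar'" as the format specifier.
    error_type = type(exc).__name__

try:
    prompt.format_map({})
    plain_error_type = None
except KeyError as exc:
    # A plain dict fails earlier: it has no key named "'foo'".
    plain_error_type = type(exc).__name__

print(error_type, plain_error_type)
```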

Case 2: Namespace Interpolation Causes Errors

Input:

%%ai  
explain this Python code:  

{In[3]}

Error:

ValueError: Invalid format specifier ' 'bar'' for object of type 'str'

This error appears later in execution: {In[3]} is first resolved against the IPython namespace, and the substituted cell source, which itself contains curly braces, is then formatted a second time (see "Remove the Second Interpolation Step" below).

Both cases demonstrate that .format_map() does not safely handle certain inputs, leading to errors at different execution stages.
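
The delayed failure in Case 2 is consistent with the prompt being formatted twice. Below is a sketch of that sequence, using a plain dictionary as a hypothetical stand-in for the IPython namespace (`ip.user_ns`):

```python
# Hypothetical stand-in for ip.user_ns: In[3] holds the source of an earlier cell.
user_ns = {"In": ["", "", "", "print({'foo': 'bar'})"]}

prompt = "explain this Python code:\n\n{In[3]}"

# First pass succeeds: {In[3]} is replaced by the cell source.
first_pass = prompt.format_map(user_ns)

# Second pass fails: the substituted text now contains literal braces.
try:
    first_pass.format_map(user_ns)
    second_pass_error = None
except (KeyError, ValueError) as exc:
    second_pass_error = type(exc).__name__

print(second_pass_error)
```

With a plain dict the second pass raises a KeyError; a mapping that supplies values for missing names would instead surface the ValueError reported above.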


Proposed Solution

1. Change the Interpolation Syntax

The updated implementation introduces a new syntax for placeholders:

  • Old syntax: {variable}
  • New syntax: @{variable}

This change is reflected in the updated PromptStr class below. The class processes the input to:

  • Convert custom placeholders (e.g. @{var}) into standard placeholders ({var}) for interpolation.
  • Double all other literal curly braces so that they are preserved literally.
  • Raise a ValueError if any custom placeholder contains extra or nested curly braces.

import re


class PromptStr(str):
    """
    A string subclass that processes its content to support a custom
    placeholder delimiter. Custom placeholders are marked with "@{...}".
    
    When format() or format_map() is called, the instance is first processed:
      - Custom placeholders (e.g. "@{var}") are converted into standard
        placeholders ("{var}") for interpolation.
      - All other literal curly braces are doubled (e.g. "{" becomes "{{")
        so that they are preserved literally.
    
    If any custom placeholder contains additional curly braces (i.e. nested
    braces), a ValueError is raised.
    """
    def __init__(self, text):
        self._template = self._process_template(text)

    @staticmethod
    def _process_template(template: str) -> str:
        """
        Process the input template so that:
          - Any custom placeholder of the form "@{...}" is converted into
            a normal placeholder "{...}".
          - All other literal curly braces are doubled so that they remain
            unchanged during formatting.
          
        Assumes that the custom placeholder does not contain nested braces.
        If nested or extra curly braces are found within a custom placeholder,
        a ValueError is raised.
        """
        # Pattern to match custom placeholders: "@{...}" where ... has no braces.
        pattern = r'@{([^{}]+)}'
        tokens = []
    
        def token_replacer(match):
            inner = match.group(1)
            assert ("{" not in inner) and ("}" not in inner)
            tokens.append(inner)
            return f'<<<{len(tokens)-1}>>>'
    
        template_with_tokens = re.sub(pattern, token_replacer, template)
        if "@{" in template_with_tokens:
            raise ValueError("Curly braces are not allowed inside custom placeholders.")
    
        escaped = template_with_tokens.replace("{", "{{").replace("}", "}}")
        for i, token in enumerate(tokens):
            escaped = escaped.replace(f'<<<{i}>>>', f'{{{token}}}')
        return escaped
    
    def format(self, *args, **kwargs):
        return self._template.format(*args, **kwargs)
    
    def format_map(self, mapping):
        return self._template.format_map(mapping)
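
To make the transformation concrete, here is a condensed, standalone sketch of the same processing logic. The function name `process_template` and the sample `user_ns` dictionary are illustrative assumptions, not part of the PR.

```python
import re


def process_template(template: str) -> str:
    """Convert "@{name}" to "{name}" and escape every other brace."""
    tokens = []

    def stash(match):
        # Park the placeholder body so the brace-doubling step cannot touch it.
        tokens.append(match.group(1))
        return f"<<<{len(tokens) - 1}>>>"

    with_tokens = re.sub(r"@\{([^{}]+)\}", stash, template)
    if "@{" in with_tokens:
        raise ValueError("Curly braces are not allowed inside custom placeholders.")

    escaped = with_tokens.replace("{", "{{").replace("}", "}}")
    for i, token in enumerate(tokens):
        escaped = escaped.replace(f"<<<{i}>>>", "{" + token + "}")
    return escaped


# Hypothetical namespace standing in for ip.user_ns.
user_ns = {"In": ["", "", "print({'oof': 'rab'})"]}

prompt = "explain:\nprint({'foo': 'bar'})\n@{In[2]}"
processed = process_template(prompt)
result = processed.format_map(user_ns)

print(processed)
print(result)
```

The literal braces survive formatting because they were doubled first, while the single `@{In[2]}` placeholder is resolved by the standard formatter.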

2. Remove the Second Interpolation Step

Previously, prompt interpolation occurred twice:

  1. Once in the ai method.
  2. Again inside run_ai_cell.

The redundant second interpolation has been removed. Now, interpolation happens only once via PromptStr.format_map(ip.user_ns), ensuring a single, safe transformation of the input.


Limitations

  1. Breaking Changes to the Previous Templating Format

    • Impact: Existing code using the old {variable} syntax will break.
    • Consideration: A backwards compatibility mode could be implemented, or the delimiter could be made configurable via user settings.
  2. No Python Expressions Inside Interpolation

    • Unlike f-strings, expressions like @{x + y} are not supported.
    • Only what str.format field names allow is accepted: a variable name, optionally followed by attribute or index access (e.g. @{variable}, @{In[2]}).
  3. No Nested Interpolations

    • Custom placeholders containing nested curly braces (e.g. "@{this_is {not allowed}}") will raise an error.
    • Potential Fix: A future enhancement might allow nested expressions with proper parsing, but for now it is strictly disallowed.
  4. Curly Braces Inside Interpolation Not Allowed

    • Any attempt to use { or } inside an @{} block will raise an exception.
  5. Delimiter Conflicts with Other Languages/Frameworks

    • The chosen delimiter @{} may conflict with other languages or templating engines that already use this syntax (e.g., certain JavaScript frameworks or Razor syntax in .NET).
    • Workaround: Offering a configuration option to customize the delimiter would mitigate this issue.
  6. Lack of Configurability

    • Currently, the template delimiter is hardcoded.
    • Enhancement: Future versions might expose a configuration parameter so users can set their own delimiter if the default conflicts with their use case or other tooling.
  7. Migration and Documentation Overhead

    • Developers will need to update their notebooks and code to adopt the new syntax.
    • Comprehensive documentation and migration guides will be necessary to ease the transition.
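
Limitation 2 follows from how str.format_map resolves field names once @{...} has been converted to {...}. The sketch below uses plain format strings and a hypothetical namespace (the `Cell` class and all values are made up for illustration):

```python
class Cell:
    """Hypothetical object used to show attribute access."""

    source = "x = 1"


namespace = {"x": 1, "y": 2, "In": ["", "a = 1", "b = 2"], "cell": Cell()}

# Index and attribute access are part of the format-string field syntax.
indexed = "{In[2]}".format_map(namespace)
attribute = "{cell.source}".format_map(namespace)

# An expression is not evaluated: the whole text is used as a lookup key.
try:
    "{x + y}".format_map(namespace)
    expression_error = None
except KeyError as exc:
    expression_error = exc.args[0]

print(indexed, attribute, expression_error)
```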

Reference

This proposal addresses the problems detailed in Issue #27.


Behaviour after changes

When using the new interpolation rules, the following examples illustrate the updated behavior:

Successful Interpolation Example

%%ai ollama:deepseek-r1:32b
make a simple and short explanation of these two snippets of code
print({'foo': 'bar'})
@{In[2]}

Produces:

Okay, I need to explain the user's code snippets simply and concisely. The first line prints a dictionary with 'foo' as key and 'bar' as value. The second does something similar but with 'oof' and 'rab'.

I should note that each print statement outputs the dictionary in its standard format, which includes curly braces, quotes around keys and values, and colons. I'll mention that both are dictionaries but with different key-value pairs.

Since the user wants markdown output only, my explanation will be straightforward without any formatting beyond what's necessary.

The two lines of code print dictionaries to the console:

- `print({'foo': 'bar'})` outputs: `{foo: bar}`
- `print({'oof': 'rab'})` outputs: `{oof: rab}`

Both are Python dictionary literals with different key-value pairs.

Error Cases

When a custom placeholder includes extra curly braces, a ValueError is raised.

Example 1:

%%ai ollama:deepseek-r1:32b
make a simple and short explanation of these two snippets of code
"@{this_is {not allowed}}"

Produces:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: Curly braces are not allowed inside custom placeholders.

Example 2:

%%ai ollama:deepseek-r1:32b
make a simple and short explanation of these two snippets of code
"@{this_is @{not_allowed}} either"

Produces:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: Curly braces are not allowed inside custom placeholders.

@francoculaciati changed the title from "Prompt Interpolation Proposal" to "Magic Command Prompt Interpolation Proposal" on Feb 23, 2025
@dlqqq added the "enhancement" (New feature or request) label on Feb 24, 2025
@dlqqq
Member

dlqqq commented Feb 24, 2025

@francoculaciati Wow, this is a big change! I'll review this PR tomorrow. Did you explore @krassowski's suggestion in issue #27 to use the @no_var_expand decorator? There may be a way to do this more easily with IPython APIs, so I think it would be smart to check if those meet the need here.

@francoculaciati
Author

I think that the @no_var_expand decorator works only on the parameters of the magic command, which seems to be everything in the same line as the command, and not the content of the whole code cell.

@dlqqq dlqqq linked an issue Feb 24, 2025 that may be closed by this pull request
@krassowski
Member

Some thoughts:

  • could this use standard {} syntax but be just opt-in or opt-out via magic flag (e.g. if formatting was opt-out, any error should tell user how to opt-out)?
  • if we need to diverge from Python syntax (I would rather not), could we use ${} syntax?
  • it feels like this would benefit from extensive unit tests

Member

@dlqqq dlqqq left a comment

@francoculaciati Thank you for working on this PR! You've come up with an interesting strategy to solve the issue. The code looks correct and well-documented, which is very much appreciated. ❤️

As you called out, this PR introduces a breaking change to the variable interpolation syntax that we support. Existing notebooks using jupyter_ai_magics would be broken by upgrading to a release including this change. Therefore, I don't think this PR can be backported to 2.x as it currently stands, i.e. this wouldn't be available to users until Jupyter AI v3.0.0 if we merge this as-is.

However, it may be possible to solve the original issue in a backwards-compatible way. The original issue is that curly braces may be nested within each other, which causes a runtime error. However, we may be able to solve this by using a regex that only matches valid Python variable names inside of curly braces. Here is that regex:

\{[a-zA-Z_][a-zA-Z0-9_]*\}

We can iterate through each of the matches and replace the match only if the variable name is in the ip.user_ns dictionary. This seems simpler and more performant, and has the added benefit of being backwards-compatible.
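
A rough sketch of that idea, under a few assumptions: the function name `interpolate` is hypothetical, a capture group has been added to the regex to extract the name, and a plain dict stands in for ip.user_ns.

```python
import re

# The regex from above, with a capture group added around the variable name.
VAR_PATTERN = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")


def interpolate(prompt: str, user_ns: dict) -> str:
    """Replace {name} only when name is defined in user_ns."""

    def replace(match):
        name = match.group(1)
        if name in user_ns:
            return str(user_ns[name])
        # Unknown names are left exactly as written.
        return match.group(0)

    return VAR_PATTERN.sub(replace, prompt)


# Hypothetical namespace standing in for ip.user_ns.
user_ns = {"code": "x = 1"}
prompt = "explain {code} and {unknown}\nprint({'foo': 'bar'})"
result = interpolate(prompt, user_ns)
print(result)
```

The dictionary literal is untouched because its contents are not a bare identifier. Note that this regex, as written, would also leave indexed forms such as {In[3]} unexpanded, so supporting those would need a wider pattern.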

@francoculaciati Would you be interested in working on this? This probably can be done in a separate PR since the approach is entirely different.

Successfully merging this pull request may close these issues.

Error when magic command prompt includes braces / curly brackets