PoC freeform output w/ XML inserts (instead of strict JSON) #6253

Open
1 task done
Pwuts opened this issue Nov 15, 2023 · 8 comments
Labels
AI efficacy fridge Items that can't be processed right now but can be of use or inspiration later function: prompt generation

Comments

@Pwuts
Member

Pwuts commented Nov 15, 2023

Duplicates

  • I have searched the existing issues

Summary 💡

LLMs are bad at generating JSON. That's not surprising: compared to natural language it's like Brainfuck, and without whitespace it isn't human-readable. Enter XML: it shares its tag syntax with HTML, which is all over the web, and the opening and closing tags make it more readable (and thus more writable too).

Credits to @ntindle for the idea!

Examples 🌈

  • User: I'm curious about countries’ capitals. Like, what's the capital of the U.S.?

  • Assistant:
    I don't know what the capital of the U.S. is. I'll look for this through a web search.

    <use_tool>
      <name>web_search</name>
      <args>
        <question>What is the capital of the U.S.?</question>
      </args>
    </use_tool>
    
  • Function:
    The capital of the U.S. is Washington, D.C.

  • Assistant:
    Let's also feed your curiosity by searching for capitals of 10 more countries!

    <use_tool>
      <name>web_search</name>
      <args>
        <question>Top 10 country capitals</question>
      </args>
    </use_tool>
    
  • Function:
    [insert 10 capitals here]

  • Assistant:
    <answer>The capital of the U.S. is Washington, D.C. Some other national capitals are: [insert 10 capitals here]</answer>
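
A minimal sketch of how the parsing side of the exchange above might look, assuming the model wraps each tool call in a `<use_tool>` block exactly as in the examples. The function name `split_response` and the shape of the returned dicts are hypothetical and not taken from the AutoGPT codebase.

```python
import re
import xml.etree.ElementTree as ET

# Matches each <use_tool>...</use_tool> insert in otherwise free-form text.
TOOL_BLOCK = re.compile(r"<use_tool>.*?</use_tool>", re.DOTALL)


def split_response(text: str):
    """Split a response into its free-form text and its tool calls (hypothetical helper)."""
    calls = []
    for match in TOOL_BLOCK.finditer(text):
        root = ET.fromstring(match.group(0))
        name = root.findtext("name", default="").strip()
        args_el = root.find("args")
        # Each child of <args> becomes one keyword argument.
        args = {}
        if args_el is not None:
            args = {child.tag: (child.text or "").strip() for child in args_el}
        calls.append({"name": name, "args": args})
    # Whatever remains after removing the inserts is the model's free-form reasoning.
    thoughts = TOOL_BLOCK.sub("", text).strip()
    return thoughts, calls
```

Note that `ET.fromstring` raises `ParseError` if an argument contains an unescaped `&` or `<`, so a real implementation would likely need a more forgiving fallback.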

Motivation 🔦

Reasoning: The closer our output format is to the "natural" format that the LLM is trained to output, the higher the reliability and performance can be.

@zedatrix
Contributor

zedatrix commented Dec 8, 2023

What about considering markdown? It's even closer to natural language

@ntindle
Member

ntindle commented Dec 8, 2023

I considered it, but parsing nested markdown to identify key-value responses seemed non-trivial, especially when they are nested inside a list.

Would love an example of how you’d imagine the outputs above in markdown.

@zedatrix
Contributor

zedatrix commented Dec 8, 2023

I think there are some quasi-markdown formats out there that might help with that, e.g. Slack has a modified version for their UI builder. I can research this and respond with the example you asked for.

@Pwuts Pwuts modified the milestones: Auto-GPT v0.5.0, AutoGPT v0.6.0 Dec 14, 2023
Contributor

github-actions bot commented Feb 3, 2024

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

@github-actions github-actions bot added the Stale label Feb 3, 2024
@Pwuts
Member Author

Pwuts commented Feb 12, 2024

Unstale! 🪄

@MKdir98
Contributor

MKdir98 commented Feb 12, 2024

Has anyone started on this issue? Do we need to change the prompts? I'm working with Mistral. Could I pick this one up, and would passing tests be enough to show that everything is OK? Or is there more to it than that?

@github-actions github-actions bot removed the Stale label Feb 13, 2024
@Pwuts
Member Author

Pwuts commented Feb 13, 2024

@MKdir98 a PoC for this issue needs changes to both the prompt and the parsing stage. For example you could try converting the OneShotPromptStrategy in autogpt/agents/prompt_strategies/one_shot.py to use XML instead of JSON.
Then if that works in the most basic form, you could try specifying different types of cells (e.g. use_tool, python, answer) and see if the LLM uses them correctly.
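
As a rough illustration of the second step, here is a sketch of how a parser could pick out the different cell types from a free-form response. It assumes each cell is a top-level tag named `use_tool`, `python`, or `answer`; the names `CELL_TYPES` and `extract_cells` are hypothetical and not part of `one_shot.py`.

```python
import re

# Cell types the prompt would tell the LLM it may use (assumed set).
CELL_TYPES = ("use_tool", "python", "answer")

# One pattern for all cell types; the backreference requires a matching closing tag.
CELL_PATTERN = re.compile(
    r"<(?P<tag>" + "|".join(CELL_TYPES) + r")>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)


def extract_cells(text: str):
    """Return (cell_type, body) pairs in the order they appear in the text."""
    return [
        (m.group("tag"), m.group("body").strip())
        for m in CELL_PATTERN.finditer(text)
    ]
```

The cell bodies are matched with a regex rather than an XML parser so that a `python` cell can contain characters such as `<` without escaping; the body of a `use_tool` cell could still be handed to an XML parser afterwards.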

Contributor

github-actions bot commented Apr 4, 2024

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

@github-actions github-actions bot added the Stale label Apr 4, 2024
@ntindle ntindle removed the Stale label Apr 4, 2024
@Pwuts Pwuts added the fridge Items that can't be processed right now but can be of use or inspiration later label Apr 4, 2024
@Pwuts Pwuts reopened this Jun 29, 2024
Projects
Status: Done
Development

No branches or pull requests

5 participants