-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Code extraction fix #1834
Code extraction fix #1834
Conversation
Could you write some tests for this? Also, could you check if we also need to update |
I think @BeibinLi or @LittleLittleCloud added some way to extract single line of code before. |
yes have tested using MarkdownCodeExtractor and it works perfectly now. Test with a user proxy agent and a assistant with a system message containing a single line sh example like below : "Give all installations commands in a single line for the code you suggest in this format ```sh pip3 install Flask flask_sqlalchemy flask_bcrypt flask_login ``` " |
The new changes allow code block with 3 "`" signs to be written in a single line, which is good. However, it still cannot resolve standard code blocks with one "`" sign in a single line. I have a single-line detection in line 124: code_pattern = re.compile(CODE_BLOCK_PATTERN + r"|`([^`]+)`") Should we combine your changes with the single line detection logic we had earlier? I am open to any suggestions regarding CODE_BLOCK_PATTERN or Another question, what would be the result for:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we allow 0 or more "\n" characters. Should we also allow 0 or more "\r" characters?
They are the same at the end of the day (depends on which encoding method or operation system is used for the text file).
I think we need to be a bit more careful in changing the default behavior of code block extraction. Currently we only extract well-formed multi-line code blocks surrounded with three backticks "``". This is instruction is not hard for LLMs like GPT-4 to follow but could be an issue for open source LMs. I suggest we provide options in the code executors to turn this on, rather than changing the default behavior. |
@marklysze @GregorD1A1 how's your experience using open source models when it comes to follow code block formatting instructions. |
In my testing of open source models with code generation (based on these notebooks): I found a mix of outputs. Enhanced support for #3 and #4 below would be good. I'm not sure #2 can be accommodated for easily. Note: I'll space out the "```" so it shows correctly in this comment.
Apart from those, there were a mix of outputs.
See the output files that have been generated here for examples: |
@ekzhu I didn't tested code extraction for open source models. Hovewer, at some point I come to the conclusion, that it's better to... not use code extraction at all. Instead, I defined few tools as "insert code after line x", "replace code in lines", "create new file with code". All the file operations defined in python level inside the tools. Such function calling approach (at least at my application, where I'm creating web app) provided much better results - while with code extractor there was often situation when LLM removed half of the existing file ocasionally, with function calling were not so big mistakes. Also, function calling approach is safier, LLM can do only predefined list of things, so not requires docker to protect OS. So my advice based on practical experience - use function calling instead of code extraction when possible. Althrough I understand there are different applications of Autogen and sometimes code extractor is needed, so it's good @PranitModak improving that feature. To test it with open source LLM, I recommend to use server feature in LM Studio. |
Thanks! Would you be able to contribute a notebook on using function calling for writing code? Are you using function calling to execute the written code as well? |
@ekzhu ok, I can prepare such. I'm not using function calling to run code, in my case I used it to improve code for FastApi and Vue.js applications, both of which was running as separate processes and got automatically reloaded after changes. |
I see. I a just trying to understand the usage case. So is it right to say that the use case is given a code file to the agent, and the agent suggests changes to the code via tool call, which then makes the actual changes. Once the changes are made, the application whose source code is changed got reloaded. |
Yes, that's right. I'll try to explain the thing in notebook. |
@PranitModak as discussed above, let's make the this optional in the markdown code extractor class in |
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
12853598 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | e43a86c | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | bdb40d7 | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | 954ca45 | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10404662 | Triggered | Generic CLI Secret | eff19ac | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 06a0a5d | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 0524c77 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | d7ea410 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | e43a86c | .github/workflows/dotnet-build.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 841ed31 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 802f099 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 9a484d8 | .github/workflows/dotnet-build.yml | View secret |
10404662 | Triggered | Generic CLI Secret | e973ac3 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 89650e7 | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | e07b06b | .github/workflows/dotnet-release.yml | View secret |
10404662 | Triggered | Generic CLI Secret | abe4c41 | .github/workflows/dotnet-build.yml | View secret |
10404662 | Triggered | Generic CLI Secret | 7362fb9 | .github/workflows/dotnet-release.yml | View secret |
12853599 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10404694 | Triggered | Generic High Entropy Secret | e43a86c | test/oai/test_utils.py | View secret |
10404694 | Triggered | Generic High Entropy Secret | 954ca45 | test/oai/test_utils.py | View secret |
10404694 | Triggered | Generic High Entropy Secret | bdb40d7 | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | abad9ff | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | 954ca45 | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | c7bb588 | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | b97b99d | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | e43a86c | test/oai/test_utils.py | View secret |
12853600 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
12853601 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10493810 | Triggered | Generic Password | 49e8053 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 501610b | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 49e8053 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 501610b | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | d422c63 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 97fa339 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 49e8053 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | d422c63 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 97fa339 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | d422c63 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 97fa339 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 501610b | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10404696 | Triggered | Generic High Entropy Secret | 954ca45 | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | bdb40d7 | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | e43a86c | test/oai/test_utils.py | View secret |
10422482 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
10422482 | Triggered | Generic High Entropy Secret | bdb40d7 | test/oai/test_utils.py | View secret |
12853602 | Triggered | Generic High Entropy Secret | 79dbb7b | test/oai/test_utils.py | View secret |
11616921 | Triggered | Generic High Entropy Secret | a86d0fd | notebook/agentchat_agentops.ipynb | View secret |
11616921 | Triggered | Generic High Entropy Secret | 394561b | notebook/agentchat_agentops.ipynb | View secret |
11616921 | Triggered | Generic High Entropy Secret | 3eac646 | notebook/agentchat_agentops.ipynb | View secret |
11616921 | Triggered | Generic High Entropy Secret | f45b553 | notebook/agentchat_agentops.ipynb | View secret |
11616921 | Triggered | Generic High Entropy Secret | 6563248 | notebook/agentchat_agentops.ipynb | View secret |
12853598 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
12853598 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | 0a3c6c4 | test/oai/test_utils.py | View secret |
10404693 | Triggered | Generic High Entropy Secret | 76f5f5a | test/oai/test_utils.py | View secret |
10404662 | Triggered | Generic CLI Secret | 954ca45 | .github/workflows/dotnet-build.yml | View secret |
12853599 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
12853599 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
10404694 | Triggered | Generic High Entropy Secret | 76f5f5a | test/oai/test_utils.py | View secret |
10404694 | Triggered | Generic High Entropy Secret | 0a3c6c4 | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | 3b79cc6 | test/oai/test_utils.py | View secret |
10404695 | Triggered | Generic High Entropy Secret | 11baa52 | test/oai/test_utils.py | View secret |
12853600 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
12853600 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
12853601 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
12853601 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
10493810 | Triggered | Generic Password | 3b79cc6 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 11baa52 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 11baa52 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10493810 | Triggered | Generic Password | 3b79cc6 | notebook/agentchat_pgvector_RetrieveChat.ipynb | View secret |
10404696 | Triggered | Generic High Entropy Secret | 0a3c6c4 | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | 76f5f5a | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
10404696 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
10422482 | Triggered | Generic High Entropy Secret | 2b3a9ae | test/oai/test_utils.py | View secret |
10422482 | Triggered | Generic High Entropy Secret | c03558f | test/oai/test_utils.py | View secret |
and 16 others.
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
@PranitModak are you still interested in working on this? |
Looks like this is dormant, if you want to pick this up again please feel free to reopen. Thanks! |
Why are these changes needed?
Current Code Extractor is not able to extract single like codes given by agent.
It ideally should be able to extract any code given by the agent if it has a correct markdown format.
Related issue number
Checks
Current regex do not extract single line codes given by agent like below.
Please install required modules with ```sh pip3 install Flask flask_sqlalchemy flask_bcrypt flask_login ``` and execute the following Python code which uses Flask: