feat(tools): Parallel function calling #1726
Conversation
@stippi2 this is quite experimental here, but I'd be happy if you could give some early feedback, at least to double-check that we are doing the correct thing at the API level.
Thanks for the heads-up! Will give it a shot.
BTW, last time you mentioned I was running the wrong model for successful function calling. Can you recommend one I should test with? I am still learning what to look out for in a model description. I have a 12-core Windows PC with a 3090 in it, and an 8-core Intel i9 MacBook. Both computers have 32 GiB of RAM, and the 3090 has 24 GiB of VRAM. My setup for building LocalAI from source is on the MacBook, so if there is a model I could try on the Mac, that would be best. Thanks for any pointers.
I had good results with WizardLM 30b. However, I think you can find models which are fine-tuned against datasets that have more training data with functions. For instance, good ones that pop into my mind are:
More broadly, models that are fine-tuned against https://huggingface.co/datasets/togethercomputer/glaive-function-calling-v2-formatted have a better chance of getting this right. Note that in my tests the best results are with models >=70b. Speculative sampling can help here as well. Also: I got good results only when I started to construct a good prompt template explaining the tools available, what they do, and how they should be used, with some examples.
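Purely as an illustration of the kind of prompt template described above (the wording below is invented for this comment, not taken from this PR or any model config), such a template might look like:

```
You have access to the following tools:
- get_current_weather(location, unit): returns the current weather for a city.

To call a tool, reply with a JSON object such as:
{"name": "get_current_weather", "arguments": {"location": "Boston", "unit": "fahrenheit"}}

Example:
User: What's the weather in Paris?
Assistant: {"name": "get_current_weather", "arguments": {"location": "Paris", "unit": "celsius"}}
```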
I've been testing this locally with mistral-openorca, and it seems to work:

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "mistral-openorca",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston and in san francisco?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}' | jq
```

result:

```json
{
  "created": 1708453855,
  "object": "chat.completion",
  "id": "1e8099ca-e73d-49a6-b2c8-2f0e29b77e12",
  "model": "mistral-openorca",
  "choices": [
    {
      "index": 0,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "index": 0,
            "id": "1e8099ca-e73d-49a6-b2c8-2f0e29b77e12",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\":\"Boston\",\"unit\":\"fahrenheit\"}"
            }
          },
          {
            "index": 0,
            "id": "1e8099ca-e73d-49a6-b2c8-2f0e29b77e12",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\":\"San Francisco\",\"unit\":\"fahrenheit\"}"
            }
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```
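A client that receives a `finish_reason` of `tool_calls` is expected to execute each entry in `tool_calls`, noting that `function.arguments` arrives as a JSON-encoded string. The Python sketch below shows one way to decode and dispatch the calls from the response above; the `get_current_weather` stub and its return value are invented for illustration, a real client would call an actual weather API.

```python
import json

# Trimmed copy of the response shown above, keeping only the fields
# needed to dispatch the parallel tool calls.
response = {
    "choices": [
        {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {"type": "function",
                     "function": {"name": "get_current_weather",
                                  "arguments": "{\"location\":\"Boston\",\"unit\":\"fahrenheit\"}"}},
                    {"type": "function",
                     "function": {"name": "get_current_weather",
                                  "arguments": "{\"location\":\"San Francisco\",\"unit\":\"fahrenheit\"}"}},
                ],
            },
        }
    ]
}

# Hypothetical local implementation of the tool (placeholder value).
def get_current_weather(location, unit="fahrenheit"):
    return f"72 {unit} in {location}"

TOOLS = {"get_current_weather": get_current_weather}

def run_tool_calls(resp):
    """Execute every tool call in the first choice and collect the results."""
    results = []
    message = resp["choices"][0]["message"]
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments are a JSON string
        results.append(TOOLS[fn["name"]](**args))
    return results

print(run_tool_calls(response))
```

In a full round trip, each result would then be sent back to the API as a follow-up message so the model can compose the final answer.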
Description
This PR fixes #1275 and provides drop-in support for https://platform.openai.com/docs/guides/function-calling/parallel-function-calling
This feature is gated by a config option that needs to be explicitly enabled in the model YAML configuration file, as it is still experimental:
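As a hedged sketch of what such a gated model config might look like (the `parallel_calls` key name below is an assumption for illustration; the actual option name is defined in this PR's diff and may differ):

```yaml
name: mistral-openorca
function:
  # Hypothetical key; enables the experimental parallel tool-call support.
  parallel_calls: true
```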
Notes for Reviewers
Signed commits
This PR hasn't gone through testing (yet)