Replace Fastchat Chat Templates with HF `apply_chat_template` by mgoin · Pull Request #1499 · neuralmagic/deepsparse

mgoin · 2023-12-21T19:09:39Z

This change is inspired by vllm-project/vllm#1365 and was prompted by prefill benchmarking. Using the HF chat template stored in the model's tokenizer seems to be a more elegant approach than Fastchat's templates.

From a performance perspective, Fastchat's default template adds over 400 tokens as a preamble to chat conversations, without even using the prompt format the model was trained on. Example below:

### Human: Got any creative ideas for a 10 year old’s birthday?
### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:
1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.
2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.
3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.
4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.
5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.
6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.
7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.
8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.
Remember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!
### Human:

All of the above is just the default preamble. Fastchat then wraps the conversation in a `### Human: ... ### Assistant:` format, which is not correct for our tested model, `hf:neuralmagic/TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds`, which prefers `<s>[INST] ... <<SYS>> ... <</SYS>> ... [/INST]`. HF `tokenizer.apply_chat_template` has no problem generating the right format and integrating the roles properly from the same list-of-dictionaries message format.
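
To make the list-of-dictionaries input concrete, here is a minimal sketch of how role-tagged messages could be rendered into an `[INST]`/`<<SYS>>`-style prompt. The function `render_inst_prompt` and its exact spacing are hypothetical and hand-written for illustration; the real layout comes from the template stored with the model's tokenizer, not from this code.

```python
from typing import Dict, List


def render_inst_prompt(messages: List[Dict[str, str]]) -> str:
    """Hand-written approximation of an [INST]/<<SYS>> layout (illustrative only)."""
    system = ""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"].strip()
        if role == "system":
            # Hold the system prompt until the next user turn.
            system = f"<<SYS>>\n{content}\n<</SYS>>\n\n"
        elif role == "user":
            # The pending system prompt (if any) is folded into this user turn.
            parts.append(f"<s>[INST] {system}{content} [/INST]")
            system = ""
        elif role == "assistant":
            parts.append(f" {content} </s>")
        else:
            raise ValueError(f"Unsupported role: {role!r}")
    return "".join(parts)


# Same message structure an OpenAI-style chat request carries.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Got any creative ideas for a 10 year old's birthday?"},
]
prompt = render_inst_prompt(messages)
print(prompt)
```

With a real tokenizer, the equivalent call would be along the lines of `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, which renders the same `messages` list using the model's own template and adds no unrelated preamble.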

dsikka

LGTM pending merge conflicts fixed

mgoin added 5 commits December 21, 2023 17:42

Let OpenAI ChatCompletionRequest accept List[Dict] messages

e547610

Test

83a772c

Fix list parsing when building template

0414a5a

More error messages

646a640

Replace Fastchat Chat Templates with HF

7581e25

mgoin changed the base branch from main to fix-chat-protocol December 21, 2023 19:10

mgoin requested review from Satrat, bfineran and dsikka December 21, 2023 19:14

mgoin changed the title ~~Replace Fastchat Chat Templates with HF~~ Replace Fastchat Chat Templates with HF apply_chat_template Dec 21, 2023

Update setup.py

56dc6db

rahul-tuli previously approved these changes Dec 21, 2023

dsikka reviewed Dec 21, 2023

Comment thread src/deepsparse/server/openai_server.py Outdated

Comment thread src/deepsparse/server/openai_server.py Outdated

Base automatically changed from fix-chat-protocol to main December 21, 2023 19:35

Fix non-lists and add testing

34ad42b

dsikka previously approved these changes Dec 21, 2023

Merge branch 'main' into replace-fastchat

8a57982

mgoin dismissed dsikka’s stale review via 8a57982 December 21, 2023 21:46

mgoin added 6 commits December 26, 2023 16:37

Cleanup warn->info

1e72392

Update test-check.yaml

bfa3e5d

Update test deps

d303e77

Update test deps

8e081f9

Add fallback for chat template as ChatML

3ef9959

Merge branch 'main' into replace-fastchat

3f42880

bfineran approved these changes Dec 27, 2023

mgoin merged commit 4b7645f into main Dec 27, 2023

mgoin deleted the replace-fastchat branch December 27, 2023 20:47

mgoin merged 14 commits into main from replace-fastchat

4 participants
