Description
I am using models like EleutherAI/gpt-j-6B and llama-7b-hf for text generation.
I have added special tokens to the vocabulary because I want structured output.
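For reference, the tokens were added along these lines (a minimal sketch, not the exact notebook code; `sshleifer/tiny-gpt2` is used here only as a small stand-in so the snippet runs quickly, the same pattern applies to GPT-J/LLaMA). The `resize_token_embeddings` call is the step that is easy to miss:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Structure markers used in the prompt/target format shown below
SPECIAL_TOKENS = [
    "<|begincontext|>", "<|endcontext|>",
    "<|begintarget|>", "<|endtarget|>",
    "<|begindsts|>", "<|enddsts|>",
    "<|begindst|>", "<|enddst|>",
    "<|beginintent|>", "<|endintent|>",
    "<|beginbelief|>", "<|endbelief|>",
    "<|beginuseraction|>", "<|enduseraction|>",
    "<|beginaction|>", "<|endaction|>",
    "<|beginresponse|>", "<|endresponse|>",
]

# Tiny stand-in model; swap in EleutherAI/gpt-j-6B etc. in practice
tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

num_added = tokenizer.add_tokens(SPECIAL_TOKENS, special_tokens=True)
# Without this resize, the new token ids index past the embedding matrix
model.resize_token_embeddings(len(tokenizer))

# Each marker should now encode to a single token id
ids = tokenizer.encode("<|begincontext|>", add_special_tokens=False)
print(num_added, len(ids))
```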
Prompt
"<|begincontext|>I want to make a restaurant reservation for 2 people at half past 11 in the morning.<|endcontext|>",
Target
"<|begintarget|><|begindsts|><|begindst|><|beginintent|>FindRestaurants<|endintent|><|beginbelief|><|endbelief|><|enddst|><|enddsts|><|beginuseraction|>INFORM_INTENT->Restaurants^intent~FindRestaurants<|enduseraction|><|beginaction|>REQUEST->Restaurants^city~<|endaction|><|beginresponse|>Do you have a specific which you want the eating place to be located at?<|endresponse|><|endtarget|>"
Here is an example Colab notebook:
https://colab.research.google.com/drive/16qKy92cGoNPWrlQ4zlvntVGeSgjrknVF?usp=sharing
I am able to train the model without any errors. However, when I perform inference, it does not produce any structured output; it just produces random text.
Here is a sample generation
<|endintent|> I\'ll make the reservation for 6 o"clock in the evening, for two people. I\'ll make the reservation for 6 o"clock in the evening, for two people. I\'ll make the reservation for 6 o"clock in the evening, for two people.
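One sanity check I can describe (a hypothetical sketch, with `sshleifer/tiny-gpt2` as a small stand-in for the fine-tuned checkpoint path): before generating, confirm that the reloaded model's embedding matrix covers the reloaded tokenizer's vocabulary. If the added structure tokens' embeddings were never saved or reloaded, the ids no longer line up and the output loses its structure:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoint; in practice this would be the fine-tuned output dir
checkpoint = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Number of rows in the input-embedding matrix vs. tokenizer vocab size.
# If embed_rows < len(tokenizer), the added tokens have no trained
# embeddings at inference time, which yields unstructured generations.
embed_rows = model.get_input_embeddings().weight.shape[0]
print(embed_rows, len(tokenizer))
assert embed_rows >= len(tokenizer), "embedding matrix smaller than vocab"
```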
In my original code, when I train on a larger dataset and plot the train/eval loss, I can see both decrease to low values (train_loss = 0.2163, eval_loss = 0.2416). With such low losses, I am surprised that the generation has absolutely no structure. With a GPT-2 model, training for just a few steps on a small amount of data produces structured output.
Issue #326 discusses adding extra tokens to the vocabulary, which is similar to what I want to do.
Can you please give me some pointers on where I am going wrong?