How to replace GPT-3 with my own ChatGLM2-6B model? #4

Open
TuoniTuoni opened this issue Jul 24, 2023 · 2 comments

Comments

@TuoniTuoni

How to replace GPT-3 with my own ChatGLM2-6B model?

@TuoniTuoni
Author

What are LLMQAModel and LMGenerator? What is each of them used for?

@tusharkhot
Collaborator

Sorry, I am seeing these issues late. The way to view these classes is:

  • LLMQAModel: a model that takes a question and generates an answer using a generator (which can be GPT3Generator or LLMClientGenerator).
  • LMGenerator: a class that could be used to load HF models like ChatGLM2. Unfortunately we stopped using this class and instead relied on a client-server approach. Basically, you can use the LLMClientGenerator and point it at an LLMServer (https://github.com/HarshTrivedi/llm_server). This has two advantages: it removes the dependency on HF for this code (which often has to be updated), and it doesn't require each model to be loaded within DecomP. A rough sketch of the client-server interaction follows this list.
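
To make the client-server idea concrete, here is a minimal sketch of a client calling a running LLM server over HTTP. The endpoint path, payload fields, and response format below are illustrative assumptions, not the actual llm_server API (see its README for the real interface):

    import requests

    # Hypothetical client call to a running LLM server.
    # Endpoint path and payload are illustrative assumptions only.
    def generate(prompt, host="localhost", port=8000, max_tokens=200):
        resp = requests.post(
            f"http://{host}:{port}/generate",
            json={"prompt": prompt, "max_tokens": max_tokens},
            timeout=120,
        )
        resp.raise_for_status()
        # Assumed response shape for this sketch, e.g. {"generated_text": "..."}
        return resp.json()

    answer = generate("Q: Who wrote Hamlet?\nA:")

The point is that DecomP only needs this thin client; the model itself runs behind the server, so it can be swapped out without touching the DecomP code.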

So, with regard to your question about using a different model, you have two choices:

  • Use the LLM server code to start a server with your preferred model and change your configs to point to this server, e.g.:
      gen_params = {
        "gen_model": "llm_api",
        "model_name": "<your model>",
        "host": "<host>",
        "port": <port>
      }

    You may need to update the server code to handle newer models.
  • Modify the DecomP code to use LMGenerator, and also update the LMGenerator code to use the latest HF Transformers code needed for your model. This will load the model each time you run an experiment and would require a GPU machine for experiments. A sketch of what the updated HF loading code might look like follows this list.
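
For the second option, the HF Transformers usage below follows the ChatGLM2-6B model card. How this gets wired into LMGenerator is up to you, so treat it as a minimal standalone sketch (requires a GPU) rather than a drop-in patch:

    from transformers import AutoModel, AutoTokenizer

    # ChatGLM2-6B ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
    model = model.eval()

    # ChatGLM2's remote code exposes a chat() helper on top of generate().
    response, history = model.chat(tokenizer, "What is the capital of France?", history=[])
    print(response)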

Please let me know if either of these directions works for you. I can help you with the changes needed to support other models.
