Patch for Gaudi Text-Generation Pipeline #690

Merged
regisss merged 5 commits into huggingface:main from sjagtap1803:textgen_pipeline_patch
Feb 7, 2024

@sjagtap1803
Contributor

What does this PR do?

This PR includes some minor changes to the text-generation pipeline code for langchain==0.0.191 compatibility.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@sjagtap1803 sjagtap1803 requested a review from regisss as a code owner February 6, 2024 04:35
@sjagtap1803
Contributor Author

sjagtap1803 commented Feb 6, 2024

Hi @regisss. As per our conversation this morning, I have changed the output format of the pipeline class in order to make it work with langchain==0.0.191.

I would appreciate it if you could try running the run_pipeline.py script and let me know if you face any issues.

Will work on adding some langchain examples to the blog next if these changes look good. Thanks!

Collaborator

@regisss regisss left a comment

Maybe we can test the type of the output and keep the former way if it is a string (I guess it is, maybe the type is different) else return the new format. WDYT?

@sjagtap1803
Contributor Author

> Maybe we can test the type of the output and keep the former way if it is a string (I guess it is, maybe the type is different) else return the new format. WDYT?

I tried the former output type (string) but faced some issues with langchain==0.0.191. It looks like langchain expects the output to be a dictionary wrapped in a list as shown here: https://github.com/langchain-ai/langchain/blob/b3ae6bcd3f42ec85ee65eb29c922ab22a17a0210/langchain/llms/huggingface_pipeline.py#L169
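
A minimal sketch of why the list-of-dict shape matters, assuming the consumer indexes `response[0]["generated_text"]` and strips the echoed prompt. The helper name `extract_generated_text` is hypothetical and stands in for the LangChain-side code; it is not copied from the library.

```python
from typing import Dict, List


def extract_generated_text(response: List[Dict[str, str]], prompt: str) -> str:
    """Hypothetical helper mimicking a LangChain-style consumer."""
    # Text-generation output is assumed to echo the prompt, so strip it.
    return response[0]["generated_text"][len(prompt):]


prompt = "Here is my prompt"
wrapped = [{"generated_text": prompt + " and its continuation"}]
print(extract_generated_text(wrapped, prompt))

# A plain string cannot be indexed with a string key, so the same
# consumer raises TypeError when given the former output format.
try:
    extract_generated_text(prompt + " and its continuation", prompt)
except TypeError as err:
    print(f"plain string fails: {err}")
```

With the wrapped output the helper returns only the continuation; with a bare string it fails before any text is extracted.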

@regisss
Collaborator

regisss commented Feb 6, 2024

Yep I understand. What I propose is to have something like:

if isinstance(output, str):
    # return the same as before
else:
    # return LangChain format

Would that work for you?

@sjagtap1803
Contributor Author

sjagtap1803 commented Feb 6, 2024

> Yep I understand. What I propose is to have something like:
>
> if isinstance(output, str):
>     # return the same as before
> else:
>     # return LangChain format
>
> Would that work for you?

Where do you suggest adding this check?

One way would be adding an input argument to the pipeline constructor (use_with_langchain=False) and returning output in langchain format if it's True:

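import torch
from transformers import TextGenerationPipeline
from utils import initialize_model  # assumed: helper from the example's utils.py


class GaudiTextGenerationPipeline(TextGenerationPipeline):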
    def __init__(self, args, logger, use_with_langchain=False):
        self.model, self.tokenizer, self.generation_config = initialize_model(args, logger)

        self.task = "text-generation"
        self.device = args.device

        if args.do_sample:
            self.generation_config.temperature = args.temperature
            self.generation_config.top_p = args.top_p

        self.max_padding_length = args.max_input_tokens if args.max_input_tokens > 0 else 100
        self.use_hpu_graphs = args.use_hpu_graphs
        self.profiling_steps = args.profiling_steps
        self.profiling_warmup_steps = args.profiling_warmup_steps

        self.use_with_langchain = use_with_langchain
        if self.use_with_langchain:
            self.generation_config.ignore_eos = False

        import habana_frameworks.torch.hpu as torch_hpu

        logger.info("Graph compilation...")
        for _ in range(3):
            self("Here is my prompt")
        torch_hpu.synchronize()

    def __call__(self, prompt: str):
        model_inputs = self.tokenizer.encode_plus(
            prompt, return_tensors="pt", max_length=self.max_padding_length, padding="max_length", truncation=True
        )

        for t in model_inputs:
            if torch.is_tensor(model_inputs[t]):
                model_inputs[t] = model_inputs[t].to(self.device)

        output = self.model.generate(
            **model_inputs,
            generation_config=self.generation_config,
            lazy_mode=True,
            hpu_graphs=self.use_hpu_graphs,
            profiling_steps=self.profiling_steps,
            profiling_warmup_steps=self.profiling_warmup_steps,
        ).cpu()

        output_text = self.tokenizer.decode(output[0], skip_special_tokens=True)

        # LangChain expects a list containing a dict with a "generated_text" key
        if self.use_with_langchain:
            return [{"generated_text": output_text}]

        return output_text

WDYT?
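
To see the effect of the flag without Gaudi hardware, here is a small stand-in sketch. `FakeTextGenerationPipeline` is a hypothetical class that only mirrors the return-format branch of the proposal above; it does not load a model, and its echoed output is a placeholder.

```python
from typing import Dict, List, Union


class FakeTextGenerationPipeline:
    """Hypothetical stand-in: echoes the prompt instead of calling a model."""

    def __init__(self, use_with_langchain: bool = False):
        self.use_with_langchain = use_with_langchain

    def __call__(self, prompt: str) -> Union[str, List[Dict[str, str]]]:
        output_text = prompt + " ..."  # placeholder for decoded model output
        if self.use_with_langchain:
            # Shape assumed to be what LangChain consumes
            return [{"generated_text": output_text}]
        # Former behaviour: a plain string
        return output_text


plain = FakeTextGenerationPipeline()
wrapped = FakeTextGenerationPipeline(use_with_langchain=True)

print(plain("Here is my prompt"))
print(wrapped("Here is my prompt"))
```

Defaulting the flag to `False` keeps existing callers on the string output, so only LangChain users need to opt in.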

@regisss
Collaborator

regisss commented Feb 6, 2024

@sjagtap1803 That looks good to me 👍

Collaborator

@regisss regisss left a comment

Could you also run the following from the root of the repo please?

pip install -U ruff
make style

Besides, do you already know which version of LangChain should be used? We should specify it in the README.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sjagtap1803
Contributor Author

I applied code formatting by running make style.

Regarding the LangChain version, I will test a few examples with 0.0.191 later today and update the blog accordingly. If the examples run as expected, I will specify the version in the README.
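
If 0.0.191 does turn out to be the version to document, a runtime check could complement the README note. The sketch below is only an illustration: the helper name `check_langchain_version`, its messages, and the injectable `get_version` parameter are hypothetical, and the pinned version is an assumption until the examples have been tested.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Optional

EXPECTED_LANGCHAIN = "0.0.191"  # assumption: the version under test in this PR


def check_langchain_version(
    expected: str = EXPECTED_LANGCHAIN,
    get_version: Callable[[str], str] = version,
) -> Optional[str]:
    """Return a warning message, or None if the expected version is installed."""
    try:
        installed = get_version("langchain")
    except PackageNotFoundError:
        return f"langchain is not installed; expected {expected}"
    if installed != expected:
        return f"langchain {installed} found; this example was written against {expected}"
    return None


message = check_langchain_version()
if message:
    print(message)
```

Passing `get_version` as a parameter keeps the check testable without installing or uninstalling packages.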

@regisss regisss added the run-test Run CI for PRs from external contributors label Feb 6, 2024
@regisss regisss added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Feb 7, 2024
@regisss regisss merged commit 8841452 into huggingface:main Feb 7, 2024
jychen21 pushed a commit to jychen21/optimum-habana that referenced this pull request Feb 27, 2024
dudilester pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Feb 28, 2024
HolyFalafel pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Mar 11, 2024