
[Bug Report] Padding Tokens not masked during HookedTransformer.generate() #1005

@tuomaso


Describe the bug
Currently HookedTransformer.generate() does not mask padding tokens when generating from a batch of inputs, so generating for a batch of prompts gives different results than generating from each prompt one by one.

Code example

from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
input_prompts = ["Hello, my dog is cute", "This is a much longer text. Hello, my cat is cute"]
orig_outputs = []
for prompt in input_prompts:
    out = model.generate(prompt, verbose=False, do_sample=False)
    orig_outputs.append(out)
    
batched_outputs = model.generate(input_prompts, verbose=False, do_sample=False)
for i in range(len(orig_outputs)):
    assert orig_outputs[i] == batched_outputs[i]

In this example, the output for the shorter prompt changes when batched, while the longer prompt returns the same output. This happens because attention to the padding tokens is not masked. After adding masking for the padding (as in the linked PR), the shorter prompt also returns the same output when batched.
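
To make the mechanism concrete, here is a rough sketch (not part of the original report) comparing the short prompt's next-token logits on its own vs. inside a left-padded batch. It assumes that HookedTransformer.forward accepts an attention_mask argument, that to_tokens accepts padding_side, and that transformer_lens.utils.get_attention_mask is available; treat it as an illustration rather than guaranteed library behaviour.

import torch
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")
short = "Hello, my dog is cute"
long = "This is a much longer text. Hello, my cat is cute"

# Reference: next-token logits for the short prompt on its own.
ref = model(model.to_tokens(short))[0, -1]

# The same prompt inside a left-padded batch, plus a mask over the pad positions
# (get_attention_mask is assumed to handle GPT-2's pad-token == bos-token case).
batch = model.to_tokens([short, long], padding_side="left")
mask = utils.get_attention_mask(model.tokenizer, batch, prepend_bos=True)

no_mask = model(batch)[0, -1]                          # pad tokens attended to
with_mask = model(batch, attention_mask=mask)[0, -1]   # pad tokens masked out

# If padding is attended to, the first comparison fails; with the explicit mask
# the logits should agree up to numerical noise.
print(torch.allclose(ref, no_mask, atol=1e-4))
print(torch.allclose(ref, with_mask, atol=1e-4))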

System Info
Using transformer_lens 4.55.0 (installed from pip) on Linux, Python 3.10.18.

Additional context

I have added a test case and a fix for generating with padding="left" on the linked PR, but I am not sure how this can be fixed for right padding.
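
For illustration only (this is a sketch, not the code in the linked PR), the bookkeeping for left padding could look roughly like the loop below, again assuming forward accepts attention_mask and that transformer_lens.utils.get_attention_mask exists: start from the padding mask of the prompt batch and append a column of ones for every newly sampled token, since generated tokens are never padding.

import torch
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")
prompts = ["Hello, my dog is cute", "This is a much longer text. Hello, my cat is cute"]

# Left-padded prompt batch and its padding mask.
tokens = model.to_tokens(prompts, padding_side="left")
attention_mask = utils.get_attention_mask(model.tokenizer, tokens, prepend_bos=True)

# Greedy loop without a kv-cache, for clarity: the mask grows by one column of
# ones per step because newly generated tokens are always real tokens.
for _ in range(10):
    logits = model(tokens, attention_mask=attention_mask)
    next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
    tokens = torch.cat([tokens, next_token], dim=1)
    attention_mask = torch.cat([attention_mask, torch.ones_like(next_token)], dim=1)

print(model.to_string(tokens))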

Checklist

  • [x] I have checked that there is no similar issue in the repo (required)

#999
