Python sentence-transformers all-MiniLM-L6-v2 is almost 2x faster than candle #2418

Here is my candle implementation (taken from the candle examples):

```rust
pub fn encode(&self, prompt: &str) -> Result<(Tensor, Tensor)> {
    let tokens = self
        .tokenizer
        .encode(prompt, true)
        .map_err(E::msg)?
        .get_ids()
        .to_vec();
    let token_ids = Tensor::new(&tokens[..], &self.device)?.unsqueeze(0)?;
    let token_type_ids = token_ids.zeros_like()?;
    let embeddings = self.model.forward(&token_ids, &token_type_ids)?;
    Ok((embeddings, token_ids))
}
```

And here is the Python implementation:

```python
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode([request.query])
```

Comparing just the encoding times, this is what I get:

Encoding time for Rust (candle):

```
Time taken for encoding: 57.054333ms
Time taken for encoding: 59.913916ms
Time taken for encoding: 55.118625ms
Time taken for encoding: 51.580917ms
Time taken for encoding: 60.823625ms
Time taken for encoding: 56.318333ms
Time taken for encoding: 52.357875ms
Time taken for encoding: 82.0645ms
Time taken for encoding: 52.349709ms
Time taken for encoding: 63.768209ms
Time taken for encoding: 55.508666ms
```

Encoding time for Python (sentence-transformers):

```
Time taken for encoding: 33.95 ms
Time taken for encoding: 124.68 ms
Time taken for encoding: 54.5 ms
Time taken for encoding: 30.46 ms
Time taken for encoding: 20.73 ms
Time taken for encoding: 26.07 ms
Time taken for encoding: 37.49 ms
Time taken for encoding: 24.42 ms
Time taken for encoding: 36.08 ms
Time taken for encoding: 24.55 ms
Time taken for encoding: 36.13 ms
Time taken for encoding: 29.97 ms
Time taken for encoding: 35.69 ms
Time taken for encoding: 26.8 ms
Time taken for encoding: 31.32 ms
Time taken for encoding: 30.12 ms
Time taken for encoding: 32.37 ms
Time taken for encoding: 34.27 ms
Time taken for encoding: 31.85 ms
Time taken for encoding: 35.78 ms
Time taken for encoding: 44.09 ms
Time taken for encoding: 19.15 ms
Time taken for encoding: 23.83 ms
Time taken for encoding: 33.09 ms
Time taken for encoding: 31.65 ms
```
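To put a number on the gap, the two listings above can be summarised with a median, which is less sensitive to outliers such as the 124.68 ms Python run and the 82.06 ms Rust run. The sketch below only copies the values from the listings; the names `rust_ms`, `python_ms`, and `summarize` are illustrative.

```python
from statistics import mean, median

# Timings in milliseconds, copied from the two listings above.
rust_ms = [
    57.054333, 59.913916, 55.118625, 51.580917, 60.823625, 56.318333,
    52.357875, 82.0645, 52.349709, 63.768209, 55.508666,
]
python_ms = [
    33.95, 124.68, 54.5, 30.46, 20.73, 26.07, 37.49, 24.42, 36.08,
    24.55, 36.13, 29.97, 35.69, 26.8, 31.32, 30.12, 32.37, 34.27,
    31.85, 35.78, 44.09, 19.15, 23.83, 33.09, 31.65,
]


def summarize(samples):
    """Return (median, mean) of a list of timings in milliseconds."""
    return median(samples), mean(samples)


rust_median, rust_mean = summarize(rust_ms)
python_median, python_mean = summarize(python_ms)

print(f"candle: median {rust_median:.2f} ms, mean {rust_mean:.2f} ms")
print(f"python: median {python_median:.2f} ms, mean {python_mean:.2f} ms")
print(f"median ratio (candle / python): {rust_median / python_median:.2f}x")
```

On these samples the medians are about 56.3 ms for candle and 31.85 ms for Python, a ratio of roughly 1.8x.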

Is this a limitation of the candle implementation, or am I doing something wrong here?

The experiment was run on an M1 MacBook Air.
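In case the measurement method matters: a fairer comparison would probably discard the first few calls (which may include one-time setup costs) and time the same input on both sides. A minimal sketch of such a harness follows. `time_calls` and `dummy_encode` are hypothetical names that belong to neither library, and the stand-in workload is only there so the snippet runs on its own.

```python
import time
from statistics import median


def time_calls(fn, warmup=3, runs=20):
    """Call fn() `warmup` times untimed, then return `runs` timings in ms."""
    for _ in range(warmup):
        fn()
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


# Stand-in workload (assumption). In the real benchmark this would be
# something like `lambda: model.encode([query])`.
def dummy_encode():
    return sum(i * i for i in range(10_000))


timings = time_calls(dummy_encode, warmup=3, runs=20)
print(f"median over {len(timings)} runs: {median(timings):.3f} ms")
```

The same warm-up-then-measure structure could be mirrored on the Rust side with `std::time::Instant` so that both numbers come from comparable runs.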

