# LLM benchmarks

A sample Python app for investigating the LLM benchmarks dataset from Kaggle. This can be used for some fun demos of Pieces for Developers.

## Pre-requisites

  1. Python
  2. Poetry for managing dependencies.
  3. Either VS Code or PyCharm.
  4. Pieces for Developers, along with the relevant browser extension.

After cloning this repo, run `poetry install` to install the dependencies.

## Run the code

To run the code, use the following command:

```bash
poetry run python app.py
```

## Back story

You are a developer picking up this code for the first time. This is some data science code to look at LLM benchmarks, and your first task is to sort the data and plot it.

## Demo 1 - copilot in your IDE

Imagine you have been given some code to work on, and you need to make changes.

  1. From the IDE, ask the copilot to explain the code using the code lens
  2. Add comments using the code lens

## Demo 2 - snippets

This code would be better if the data frame were sorted by score, so that we can see the best-performing models first. Let's research this in the browser, and when we find the relevant code, add it to Pieces (a sketch of the kind of snippet to expect is shown after the steps below).

  1. You can see some code to do this at stackoverflow.com/questions/37787698/how-to-sort-pandas-dataframe-by-one-column. Open this in your browser.
  2. Scroll to the second or third answer, which has just the code to do this, and ask the copilot to explain it.
  3. Add the snippet to Pieces using the browser extension.
  4. See this snippet in the desktop app and in the IDE, with all the augmentation.
  5. Use the snippet in your IDE to sort the data frame.
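For reference, the snippet you end up with should look roughly like the sketch below. The file name and the `tokens/s` column are assumptions here (the column name is borrowed from the code used later in Demo 6); adjust them to match the actual dataset.

```python
import pandas as pd

# Load the benchmark data (file name assumed for illustration)
df = pd.read_csv("llm_benchmarks.csv")

# Sort so the best-performing models come first
# ('tokens/s' is an assumed column name - use the score column in the dataset)
sorted_df = df.sort_values("tokens/s", ascending=False)
print(sorted_df.head(10))
```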

## Demo 3 - snippets from images

As an additional demo, the snippets folder contains a screenshot of code that adds a bar chart.

  1. Drag the image into Pieces desktop and show the detected code and annotations
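The screenshot itself is not reproduced here, but bar chart code for this dataset would typically look something like the following sketch; matplotlib, the file name, and the column names are all assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load the benchmark data (file name and column names assumed for illustration)
df = pd.read_csv("llm_benchmarks.csv")

# Bar chart of model throughput, one bar per model
df.plot.bar(x="LLM", y="tokens/s", legend=False)
plt.ylabel("tokens/s")
plt.tight_layout()
plt.show()
```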

## Demo 4 - live context

There is an open PR on this repo that adds plotting of a bar chart. This is a great use case for live context.

  1. Open the PR in your browser and show the code changes.
  2. From the copilot, start a new conversation with live context.
  3. Ask the copilot to explain the PR. This prompt works: 'What problems are there in the github pull request I was just looking at?'
  4. Show the output.

## Demo 5 - more advanced live context

For a more detailed PR:

  1. Open github.com/pieces-app/documentation/pull/486
  2. Read the PR
  3. Ask the copilot to summarize what Mason asked for - 'what changes did mason request in the PR I was just looking at?'

## Demo 6 - errors and live context

Pieces can help with errors as well:

  1. Add the following code to the end of the app.py file to get an error:

    # Find the worst performing LLMs
    worst_df = df.sort_values('tokens/s', ascending=True).head(10)
    print(worst_df)
  2. Run the code and look at the error in the terminal

  3. Start a new copilot chat and turn on live context.

  4. Ask the following question: 'tell me about the keyerror I just got in vs code'