
AI cell generation #137

Merged
merged 10 commits into main from cell-generation-v2 on Jul 17, 2024
Conversation

nichochar
Contributor

Introduce the ability to create cells (code, markdown) with AI.

Below is a video of the MVP experience. A few things to note:

  • I built it so that it can create either markdown or code cells. Little in the design prevented this, and it was technically natural.
  • We need to figure out how to surface more context and data when things go wrong. Our users are developers, so they'll immediately want to see logs; that's at least what I'd want.

https://www.loom.com/share/91e4905a691f424db2791539ddfaf3e8?sid=683873e2-0fae-4f98-8019-063002ff3b47

@nichochar nichochar requested a review from benjreinhart July 16, 2024 23:41
Contributor

@benjreinhart benjreinhart left a comment


IMO, the prompts are too long and wordy. There seems to be a lot of unnecessary text in them (why do we need that lengthy example?).

autoFocus
onChange={(e) => setPrompt(e.target.value)}
placeholder="Write a prompt..."
className="flex min-h-[60px] w-full rounded-sm px-3 py-2 text-sm placeholder:text-muted-foreground focus-visible:outline-none pl-0"
Contributor


Why not codemirror here so we can get auto-resize? Otherwise, we should make this a bit bigger and add a resize-none

Contributor Author


Why not codemirror here so we can get auto-resize?

Codemirror felt heavyweight for what is essentially a simple multi-line input

we should make this a bit bigger and add a resize-none

I like the resize, it's not too visible, but gives the user control should they be pasting a bigger prompt (unlikely, but possible). Why don't you like it?
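For concreteness, the two options discussed here can be sketched as Tailwind class strings. This is a minimal sketch: only `resize-none` and the existing `min-h-[60px]` come from the thread; the larger height and the `resize-y` choice are illustrative assumptions.

```javascript
// Option A (reviewer's suggestion): a larger, fixed-size textarea.
// "resize-none" removes the drag handle; min-h-[120px] is an assumed bump.
const fixedClasses =
  'flex min-h-[120px] w-full rounded-sm px-3 py-2 text-sm resize-none';

// Option B (author's preference): keep a vertical resize handle so users
// can grow the box when pasting a longer prompt.
const resizableClasses =
  'flex min-h-[60px] w-full rounded-sm px-3 py-2 text-sm resize-y';
```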

@@ -128,6 +128,25 @@ router.post('/generate', cors(), async (req, res) => {
}
});

// Generate a cell using AI from a query string
router.options('/sessions/:id/generate_cell', cors());
Contributor


Why not a websocket event?

Contributor Author


I don't have a super strong reason: I favor HTTP when I need request/response semantics with a pending state. It's also less code.

Do you think this would be better as a websocket?
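As a rough illustration of the request/response semantics being argued for, the handler's shape can be sketched independently of Express. All names below, and the stubbed AI call, are hypothetical stand-ins, not the PR's actual code:

```javascript
// Hypothetical stand-in for the real LLM call (stubbed for illustration).
// The real route would call the model provider and parse its response.
async function generateCellWithAI(prompt, type) {
  if (type === 'markdown') {
    return { type: 'markdown', text: `## ${prompt}` };
  }
  return { type: 'code', source: `// TODO: ${prompt}` };
}

// Handler mirroring router.post('/sessions/:id/generate_cell', ...):
// the client shows a pending state, then receives a single response.
async function generateCellHandler(req, res) {
  try {
    const { prompt, type } = req.body;
    const cell = await generateCellWithAI(prompt, type);
    res.status(200).json({ error: false, result: cell });
  } catch (err) {
    res.status(500).json({ error: true, result: String(err) });
  }
}
```

A websocket version would instead emit an event and require the client to correlate the reply, which is the extra code the author is avoiding.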


You are tasked with generating a Srcbook code cell for the user.

A Srcbook is a JavaScript notebook following a markdown-compatible format called .srcmd. It's an interactive and rich way of programming that follows the literate programming idea.
Contributor


It's an interactive and rich way of programming that follows the literate programming idea.

These prompts feel too wordy IMO. This ^, for example, doesn't seem necessary.

Contributor Author


Yes, the prompts can definitely be iterated on. This made more sense in the generate-srcbook one... I'll remove that particular sentence.

Without evals, it's a bit of a naked process. I am happy to remove this, but the next step for prompt iteration is to set up some basic evals. For this particular flow, the evals could programmatically test that we get valid Srcbook code back from a set of known user-request examples.

Once we have that, we can start iterating with a bit more confidence.

@nichochar
Contributor Author

IMO, the prompts are too long and wordy. There seems to be a lot of unnecessary text in them (why do we need that lengthy example?).

Do you have data suggesting that shorter prompts work better? I am happy to iterate on the prompts and shorten them, but I think the example is one of the most valuable parts. LLMs do very well when you give them examples (few-shot prompting).

Are your concerns about cost? Latency?
I'm meeting with the codestory people to talk about this, but I don't have strong intuitions yet. We need to set up evals and experiment. This is a first pass that does quite well in my empirical QA.
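To make the few-shot argument concrete, a prompt with one worked example can be assembled as a short list of chat messages. This is a hedged sketch: the system text, the example pair, and the function name are illustrative assumptions, not Srcbook's actual prompts.

```javascript
// Build messages for a chat-style LLM API: a system instruction, one
// worked example (the "few-shot" part), then the real user request.
function buildGenerateCellMessages(userRequest) {
  return [
    { role: 'system', content: 'You generate a single Srcbook code cell.' },
    // Hypothetical example pair showing the model the expected output shape:
    { role: 'user', content: 'A cell that sums an array of numbers' },
    {
      role: 'assistant',
      content:
        'const sum = (xs) => xs.reduce((a, b) => a + b, 0);\n' +
        'console.log(sum([1, 2, 3]));'
    },
    { role: 'user', content: userRequest }
  ];
}
```

Evals over a set of known requests could then check that responses parse as valid cells, which is the iteration loop described above.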

@nichochar nichochar merged commit a865b7f into main Jul 17, 2024
1 check passed
@nichochar nichochar deleted the cell-generation-v2 branch July 17, 2024 18:18