fix oai-compat embedding API by ngxson · Pull Request #220 · ngxson/wllama

ngxson · 2026-05-13T15:28:39Z

Fix #219

Missing `format_embeddings_response_oaicompat` to convert the llama.cpp-specific response to an OAI-compat response.

Summary by CodeRabbit

Bug Fixes
- Embedding responses now follow an OpenAI-compatible shape (embedding vectors under data[0].embedding).
Examples & Tests
- Updated examples and tests to parse embeddings from the new response shape and validate similarity checks.
Chores
- Package version bumped.
- Default CDN WebAssembly URL and native backend reference updated.
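
For consumers, the fix amounts to reading the vector from a different path. Below is a minimal TypeScript sketch of that access pattern, assuming the response follows the usual OpenAI embeddings layout. The `OaiEmbeddingResponse` type and the `firstEmbedding` helper are hypothetical names used for illustration and are not part of wllama's API.

```typescript
// Hypothetical type: an assumed subset of an OpenAI-compatible embedding response.
interface OaiEmbeddingResponse {
  data: { embedding: number[] }[];
}

// Hypothetical helper: returns the first embedding vector, or throws when the
// response does not have the expected shape (for example, the older layout
// that exposed the vector directly as `response.embedding`).
function firstEmbedding(response: OaiEmbeddingResponse): number[] {
  const item = response.data?.[0];
  if (!item || !Array.isArray(item.embedding)) {
    throw new Error('unexpected embedding response shape');
  }
  return item.embedding;
}
```

Failing loudly on an unexpected shape is a design choice in this sketch: code that still reads `response.embedding` would otherwise get `undefined` without any error.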

coderabbitai · 2026-05-13T15:28:54Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info


📥 Commits

Reviewing files that changed from the base of the PR and between 4fed0d9 and e086d89.

⛔ Files ignored due to path filters (1)

src/wasm/wllama.wasm is excluded by !**/*.wasm

📒 Files selected for processing (2)

cpp/wllama-context.h
src/wllama.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/wllama.test.ts
cpp/wllama-context.h

📝 Walkthrough

Walkthrough

C++ now returns typed server task results and formats embedding outputs into OpenAI-compatible responses; TypeScript uses GlueMsgEmbeddingRes for embeddings; examples and tests read vectors from response.data[0].embedding; package and generated version identifiers advanced.

Changes

Embedding Response Format Standardization

| Layer / File(s) | Summary |
| --- | --- |
| C++ Result Pointer Conversion and Embedding Formatting `cpp/wllama-context.h` | `get_next_result()` now returns `server_task_result_ptr` and error flag; `action_get_result()` detects embedding results, uses `format_embeddings_response_oaicompat` with model metadata for embeddings, otherwise `result->to_json()`, and sets `res.data_json.value` (or `""`) and `res.is_error.value`. |
| TypeScript Embedding Type Correction `src/wllama.ts` | `GlueMsgEmbeddingRes` imported and used as the response type for `createEmbedding()` (calls `proxy.wllamaAction` with `GlueMsgEmbeddingRes`). |
| Example and Test Consumer Updates `examples/basic/index.html`, `examples/embeddings/index.html`, `src/wllama.test.ts` | Examples and tests updated to extract embedding vectors from `response.data[0].embedding` instead of `response.embedding`. |
| Version and Generated Artifact Updates `llama.cpp`, `package.json`, `src/wasm-from-cdn.ts`, `src/workers-code/generated.ts` | `llama.cpp` subproject commit advanced; `package.json` version bumped to `3.1.1`; CDN wasm URL updated to `@wllama/wllama@3.1.1`; `LIBLLAMA_VERSION` updated. |

🎯 3 (Moderate) | ⏱️ ~20 minutes

ngxson/wllama#187: also updates the llama.cpp subproject commit reference and overlaps at the submodule sync point.
ngxson/wllama#194: touches LIBLLAMA_VERSION and the llama.cpp submodule pointer similar to this PR.

"A rabbit hums in code so neat,
Embeddings nested, tidy and sweet.
Types aligned, examples cheer,
Versions bumped — release is near.
Hop, test, and ship! 🐇✨"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix oai-compat embedding API' accurately and concisely describes the main objective: fixing the OpenAI-compatible embedding API response format. |
| Linked Issues check | ✅ Passed | The pull request addresses issue `#219` by implementing OpenAI-compatible embedding response formatting in the C++ backend and updating all frontend code to parse the corrected response structure. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing the embedding API response format: C++ backend changes implement OAI-compatible formatting, TypeScript/HTML frontend changes consume the new response structure, and version bumps are appropriate for a bug fix. |


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/wllama-context.h`:
- Around line 183-189: In get_next_result(), avoid calling result->is_error()
after moving result; compute the boolean before the move (e.g., bool is_err =
result->is_error()) and then return {std::move(result), is_err}; reference the
server_task_result_ptr variable result and the is_error() call so the check
happens prior to std::move.

In `@src/wllama.test.ts`:
- Around line 183-185: The test accidentally reads the second embedding from the
first response; update the extraction to use res2 instead of res when building
embedding2 so the comparison uses two separate calls. Locate the second call to
wllama.createEmbedding (res2) and change the line that sets embedding2 to read
from res2.data[0].embedding (keeping the cast to number[]), ensuring the dot
product uses embedding and embedding2 as intended.
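
The corrected test logic can be sketched as follows. This is an illustrative reconstruction and not the code from `src/wllama.test.ts`: the `EmbeddingResponse` type, the `dot` helper, and the stand-in `res` / `res2` objects are assumptions. In the real test, both objects would come from two separate `wllama.createEmbedding` calls.

```typescript
// Hypothetical response type; `embedding` is left loosely typed to mirror the
// cast to number[] mentioned in the review comment.
interface EmbeddingResponse {
  data: { embedding: unknown }[];
}

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Stand-ins for the results of two separate createEmbedding calls.
const res: EmbeddingResponse = { data: [{ embedding: [0.6, 0.8] }] };
const res2: EmbeddingResponse = { data: [{ embedding: [0.8, 0.6] }] };

const embedding = res.data[0].embedding as number[];
// Read from res2 here. Reading from res again would compare a vector with itself.
const embedding2 = res2.data[0].embedding as number[];

const similarity = dot(embedding, embedding2);
```

If `embedding2` were read from `res`, the dot product of a normalized vector with itself would always be close to 1, so a similarity assertion could pass without ever exercising the second call.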


ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between b19148a and cb50257.

⛔ Files ignored due to path filters (1)

src/wasm/wllama.wasm is excluded by !**/*.wasm

📒 Files selected for processing (5)

cpp/wllama-context.h
examples/basic/index.html
examples/embeddings/index.html
src/wllama.test.ts
src/wllama.ts

fix oai-compat embedding API

cb50257

coderabbitai Bot reviewed May 13, 2026

ngxson added 2 commits May 13, 2026 17:38

bump to latest source code

4fed0d9

nits

e086d89

ngxson merged commit e923fba into master May 13, 2026
5 of 6 checks passed

ngxson merged 3 commits into master from xsn/fix_oai_embd
