forked from ggml-org/llama.cpp
Add Jina Chinese Embedding model #2
Draft: JoanFM wants to merge 423 commits into feat-jina-v2-base-code from feat-jina-embeddings-v2-zh.

No description provided.
Conversation
Force-pushed from da96368 to d9b8dd6
Force-pushed from 72e93b2 to 3269efe
Force-pushed from fa527a5 to ea0f7df
Force-pushed from 4dc0fe9 to 605a619
…rompt (ggml-org#7950)

* SimpleChat: Allow chat request bool options to be user controlled
* SimpleChat: Allow user to control the cache_prompt flag in requests
* SimpleChat: Add sample GUI images to the readme, showing the chat screen and the settings screen
* SimpleChat: Readme: Add quickstart block and image titles, cleanup
* SimpleChat: Reposition the contents of the Info and Settings UI so they are more logically structured and flow through
* SimpleChat: Rename chatRequestOptions to apiRequestOptions, so it is not wrongly assumed that these request options apply only to the chat/completions endpoint; they are used for both endpoints, so the new name matches the semantics better
* SimpleChat: Update the readme image showing the settings UI
* SimpleChat: Readme: Switch to a webp screen image to reduce size
* add chat template support for llama-cli (see the sketch below)
* add help message
* server: simplify format_chat
* more consistent naming
* improve
* add llama_chat_format_example
* fix server
* code style
* Update examples/main/main.cpp

Co-authored-by: Georgi Gerganov <[email protected]>
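For context on the feature this commit series adds, here is a minimal sketch of formatting a conversation through the chat-template API. The `llama_chat_apply_template` signature is assumed from the `llama.h` of this era, and the helper name `format_chat_prompt` plus the buffer-resize handling are illustrative, not code from the commit:

```cpp
#include <string>
#include <vector>
#include "llama.h"

// Hedged sketch: apply the model's built-in chat template (tmpl == nullptr)
// to a short conversation and return the formatted prompt string.
static std::string format_chat_prompt(const llama_model * model) {
    std::vector<llama_chat_message> msgs = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!" },
    };
    std::vector<char> buf(4096);
    int32_t n = llama_chat_apply_template(model, nullptr,
                                          msgs.data(), msgs.size(),
                                          /*add_ass=*/true,
                                          buf.data(), (int32_t) buf.size());
    if (n < 0) {
        return ""; // the template could not be applied
    }
    if (n > (int32_t) buf.size()) {
        // buffer was too small: grow to the reported size and retry
        buf.resize(n);
        n = llama_chat_apply_template(model, nullptr, msgs.data(), msgs.size(),
                                      /*add_ass=*/true, buf.data(), (int32_t) buf.size());
    }
    return std::string(buf.data(), n);
}
```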
…ggml-org#8054)

* gguf-dump: add --data-offset
* gguf-dump: add tensor data offset table
* gguf-dump: refactor GGUFReader for clarity
* gguf-dump: add --data-alignment
* gguf-dump.py: rename variables and adjust comments: start_data_offset --> data_offset, _build_tensors_info_fields --> _build_tensor_info
* added healthcheck
* moved curl to base
…Maximum (ggml-org#7797)

* json: support minimum for positive integer values (a usage sketch follows this list)
* json: fix min 0
* json: min + max integer constraints
* json: handle negative min / max integer bounds
* json: fix missing paren min/max bug
* json: proper paren fix
* json: integration test for schemas
* json: fix bounds tests
* Update json-schema-to-grammar.cpp
* json: fix negative max
* json: fix negative min (w/ more than 1 digit)
* Update test-grammar-integration.cpp
* json: nit: move string rules together
* json: port min/max integer support to Python & JS
* nit: move + rename _build_min_max_int
* fix min in [1, 9]
* Update test-grammar-integration.cpp
* add C++11-compatible replacement for std::string_view
* add min/max constrained int field to pydantic json schema example
* fix merge
* json: add integration tests for min/max bounds
* reshuffle/merge min/max integ test cases
* nits / cleanups
* defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++: the former cannot read the final NUL char)
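To make the min/max support concrete, here is a sketch of the kind of schema it handles. The `json_schema_to_grammar` helper and its `nlohmann::ordered_json` parameter are assumed from `common/json-schema-to-grammar.h` of this era, and the `rating` field is a made-up example:

```cpp
#include <iostream>
#include <nlohmann/json.hpp>
#include "json-schema-to-grammar.h" // from llama.cpp's common/ (path assumed)

int main() {
    // An integer field constrained to [1, 9] -- one of the ranges the
    // commit adds explicit handling (and a regression test) for.
    auto schema = nlohmann::ordered_json::parse(R"({
        "type": "object",
        "properties": {
            "rating": { "type": "integer", "minimum": 1, "maximum": 9 }
        },
        "required": ["rating"]
    })");

    // Emits a GBNF grammar whose integer rule only matches 1..9.
    std::cout << json_schema_to_grammar(schema) << std::endl;
    return 0;
}
```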
* llama : fix codeshell support
* llama : move codeshell after smollm to respect the enum order
* contrib : clarify PR squashing
* contrib : fix typo + add list of modules
The check gating the use of `__builtin_amdgcn_sdot4` specifically tests for gfx1030. This causes a severe perf regression for any gfx103? target that is not gfx1030 and is not using `HSA_OVERRIDE_GFX_VERSION` (if you've built ROCm to support it). We already have a generic RDNA2 define, so use it instead; a sketch of the gating pattern follows.
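A minimal sketch of the pattern under discussion, assuming the generic `RDNA2` define from ggml's HIP path; the wrapper name `dot4_i8` and the scalar fallback body are illustrative, not the commit's diff:

```cpp
// Hedged sketch: gate the dot-product builtin on the generic RDNA2 define
// rather than a single gfx1030 check, so every gfx103x target benefits.
static inline int dot4_i8(int a, int b, int c) {
#if defined(RDNA2) // set for every gfx103x target, not just gfx1030
    return __builtin_amdgcn_sdot4(a, b, c, false);
#else
    // scalar fallback: four int8 multiply-accumulates
    const signed char * va = (const signed char *) &a;
    const signed char * vb = (const signed char *) &b;
    return c + va[0]*vb[0] + va[1]*vb[1] + va[2]*vb[2] + va[3]*vb[3];
#endif
}
```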
* Fix Vulkan matmul tests compile errors
* Add Vulkan IQ4_NL support
* Fix Vulkan DeepSeek-Coder-V2-Lite MoE support
…g#8508)

* llama : move sampling code into llama-sampling (ggml-ci)
* llama : move grammar code into llama-grammar (ggml-ci)
* cont (ggml-ci)
* cont : pre-fetch rules
* cont (ggml-ci)
* llama : deprecate llama_sample_grammar
* llama : move tokenizers into llama-vocab (ggml-ci)
* make : update llama.cpp deps [no ci]
* llama : redirect external API to internal APIs (ggml-ci)
* llama : suffix the internal APIs with "_impl" (ggml-ci)
* llama : clean-up
* Update cmake to support nvidia hardware & open-source compiler

Signed-off-by: Joe Todd <[email protected]>
* fix export-lora example
* add more logging
* reject merging subset
* better check
* typo
* fix `llama_chat_format_single` for mistral
* fix typo
* use printf
Ensure SYCL CI builds both static & dynamic libs for testing purposes.

Signed-off-by: Joe Todd <[email protected]>
Added link to game I made that depends on llama
* use sliding window for phi3
* fix typo: "data_swa" -> "data"
* convert_hf_to_gguf.py: add phi3 sliding window
* docfix: imatrix readme, quantum models -> quantized models
* docfix: server readme, quantum models -> quantized models
…8669)

* examples : remove finetune and train-text-from-scratch
* fix build
* update help message
* fix small typo for export-lora
Signed-off-by: Chen Xi <[email protected]>
Co-authored-by: Meng, Hengyu <[email protected]>
* Improvements for Windows with Snapdragon X
* Revert "Improvements for Windows with Snapdragon X" (this reverts commit bf21397)
* Improvements for Windows with Snapdragon X
* WoA build clarifications
* Windows on ARM build clarifications
* cmake build for Windows clarifications
* Update docs/build.md

Co-authored-by: AndreasKunar <andreaskmsn.com>
Co-authored-by: Georgi Gerganov <[email protected]>
`ggml_init` can fail if no unused context is found. In that case, a NULL-pointer dereference happens later, during the call to `ggml_set_no_alloc`. This fixes it by bailing out early when no context is found; a sketch of the pattern follows.
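The defensive pattern, as a minimal sketch; `ggml_init`, `ggml_init_params`, and `ggml_set_no_alloc` are real ggml APIs, while the wrapper `make_ctx` and its parameters are illustrative:

```cpp
#include <stddef.h>
#include "ggml.h"

// Hedged sketch of the fix described above: check the result of ggml_init()
// before using it, instead of passing a NULL context to ggml_set_no_alloc().
static struct ggml_context * make_ctx(size_t mem_size) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ mem_size,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);
    if (ctx == NULL) {
        // no unused context slot was available: bail out here rather than
        // dereferencing NULL further down
        return NULL;
    }
    ggml_set_no_alloc(ctx, true);
    return ctx;
}
```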
* server : add Speech Recognition & Synthesis to UI
* server : add Speech Recognition & Synthesis to UI (fixes)
Force-pushed from 2cfcbbe to e2a91ef
Force-pushed the feat-jina-embeddings-v2-zh branch from e2a91ef to 201559d
JoanFM pushed a commit that referenced this pull request on Oct 8, 2024.
* vulkan : do not use tensor->extra

  This patch allows using the Vulkan backend with the RPC backend, as tensor->extra is no longer used.

  Ref: ggml-org#8536

* Adapt GGML_VULKAN_CHECK_RESULTS to extra removal (#2)

Co-authored-by: 0cc4m <[email protected]>