Issues: abetlen/llama-cpp-python
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage (#1749), opened Sep 19, 2024 by 1431551850
How to get output from a fine-tuned Llama 3 model (trained on an Alpaca-format dataset) in JSON format? (#1744), opened Sep 18, 2024 by ApurvPujari
LlamaDiskCache: needs a read-only / 'static' disk cache for RAG use cases (#1737), opened Sep 11, 2024 by tc-wolf
[Draft Issue] system crash on exit (after inference is done) (#1735), opened Sep 11, 2024 by Mrw33554432
Scores are stored in a 32-bit NumPy array even when K and V are quantized (#1732), opened Sep 5, 2024 by EthanZoneCoding
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 behavior is strange (#1720), opened Aug 31, 2024 by Enchante503
No matter how many times I build it, it won't start (#1719) [bug, build], opened Aug 31, 2024 by Enchante503
Allow python packages to contribute to LlamaChatCompletionHandlerRegistry (#1715), opened Aug 29, 2024 by axel7083
flash attention on Nvidia Tesla P100s results in CUDA error: unspecified launch failure (CUDA kernel flash_attn_tile_ext_f16 has no device code compatible with CUDA arch 520) (#1710) [bug, build], opened Aug 26, 2024 by AlHering
Empty output when running Q4_K_M quantization of Llama-3-8B-Instruct with llama-cpp-python (#1696), opened Aug 22, 2024 by smolraccoon