@@ -14,7 +14,7 @@ locally on most computers, with no installation.
## Quickstart

The easiest way to try it for yourself is to download our example
-llamafile for the [LLaVA](https://llava-vl.github.io/) model (license: [LLaMA](https://github.com/facebookresearch/llama/blob/main/LICENSE),
+llamafile for the [LLaVA](https://llava-vl.github.io/) model (license: [LLaMA 2](https://ai.meta.com/resources/models-and-libraries/llama-downloads/),
[OpenAI](https://openai.com/policies/terms-of-use)). LLaVA is a new LLM that can do more
than just chat; you can also upload images and ask it questions
about them. With llamafile, this all happens locally; no data
@@ -389,7 +389,7 @@ FLAGS
  -0      store uncompressed (currently default)
```

-## Technical Details
+## Technical details

Here is a succinct overview of the tricks we used to create the fattest
executable format ever. The long story short is that llamafile is a shell
@@ -402,7 +402,7 @@ The llama.cpp executable then opens the shell script again as a file,
and calls mmap() again to pull the weights into memory and make them
directly accessible to both the CPU and GPU.

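To make that step concrete, here is a minimal C sketch of the mmap() call. It is not llamafile's actual loader: the offset and size below are placeholders (the real code computes the page-aligned position of the weights from the ZIP records in the executable), and `/proc/self/exe` is a Linux-only stand-in for reopening the shell script's own path.

```c
// A minimal sketch of the mmap() step, NOT llamafile's actual loader.
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    // Reopen our own executable as an ordinary file. /proc/self/exe is
    // a Linux-only shortcut; llamafile reopens the shell script's path.
    int fd = open("/proc/self/exe", O_RDONLY);
    if (fd == -1) return 1;

    // Offset 0 keeps this sketch runnable on any binary; in llamafile
    // it would be the page-aligned offset of the weights inside the ZIP.
    off_t offset = 0;
    size_t size = 4096;

    // The kernel demand-pages the bytes straight from the file, so the
    // weights become addressable memory without ever being copied.
    unsigned char *weights = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, offset);
    if (weights == MAP_FAILED) return 1;

    printf("first mapped byte: 0x%02x\n", weights[0]);
    munmap(weights, size);
    close(fd);
    return 0;
}
```
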
-### ZIP Weights Embedding
+### ZIP weights embedding

The trick to embedding weights inside llama.cpp executables is to ensure
the local file is aligned on a page size boundary. That way, assuming
@@ -415,7 +415,7 @@ program should be able to read them, provided they support ZIP64. This
makes the weights much more easily accessible than they otherwise would
have been, had we invented our own file format for concatenated files.

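The alignment rule itself is simple arithmetic. The sketch below is illustrative, not llamafile's code: it computes how many padding bytes to put in a ZIP local file header's extra field so that the entry's data lands on a page boundary.

```c
// Illustrative arithmetic for the alignment rule, not llamafile's code.
// A ZIP local file header is 30 fixed bytes, then the file name, then
// the extra field; padding the extra field shifts the data that follows.
#include <stdio.h>

#define PAGE_SIZE 4096u
#define LOCAL_FILE_HEADER_SIZE 30u

// How many extra-field padding bytes are needed so the entry's data
// starts at a multiple of the page size.
static unsigned padding_for(unsigned header_offset, unsigned name_len) {
    unsigned data_offset = header_offset + LOCAL_FILE_HEADER_SIZE + name_len;
    return (PAGE_SIZE - data_offset % PAGE_SIZE) % PAGE_SIZE;
}

int main(void) {
    // e.g. a header at byte 123456 whose entry name is 18 bytes long
    unsigned pad = padding_for(123456, 18);
    printf("pad the extra field with %u bytes\n", pad);  // prints 3472
    // (123456 + 30 + 18 + 3472) % 4096 == 0, so mmap() can map the
    // weights directly at that offset.
    return 0;
}
```
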
-### Microarchitectural Portability
+### Microarchitectural portability

On Intel and AMD microprocessors, llama.cpp spends most of its time in
the matmul quants, which are usually written thrice for SSSE3, AVX, and
@@ -425,7 +425,7 @@ that can be `#include`ed multiple times, with varying
wrapper function is added which uses Cosmopolitan's `X86_HAVE(FOO)`
feature to runtime dispatch to the appropriate implementation.

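The dispatch pattern looks roughly like the following sketch. The `vec_dot_*` names are hypothetical, and the GCC/Clang builtin `__builtin_cpu_supports()` stands in here for Cosmopolitan's `X86_HAVE(FOO)` macro; in llamafile each variant comes from `#include`ing the same source file under a different target attribute rather than being written out by hand.

```c
// Sketch of the runtime-dispatch pattern (x86-only); names hypothetical.
#include <stdio.h>

__attribute__((__target__("avx2"))) static void vec_dot_avx2(void) {
    puts("AVX2 kernel");
}

__attribute__((__target__("avx"))) static void vec_dot_avx(void) {
    puts("AVX kernel");
}

static void vec_dot_ssse3(void) {
    puts("SSSE3 kernel");
}

// Wrapper: probe the CPU at runtime and call the best implementation.
static void vec_dot(void) {
    if (__builtin_cpu_supports("avx2")) {
        vec_dot_avx2();
    } else if (__builtin_cpu_supports("avx")) {
        vec_dot_avx();
    } else {
        vec_dot_ssse3();
    }
}

int main(void) {
    vec_dot();
    return 0;
}
```
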
-### Architecture Portability
+### Architecture portability

llamafile solves architecture portability by building llama.cpp twice:
once for AMD64 and again for ARM64. It then wraps them with a shell
@@ -441,7 +441,7 @@ program that comes included with the `cosmocc` compiler. What the
the host platform's native executable format. This guarantees a fallback
path exists for traditional release processes when it's needed.

-### GPU Support
+### GPU support

Cosmopolitan Libc uses static linking, since that's the only way to get
the same executable to run on six OSes. This presents a challenge for