# llama.cpp/example/run

The purpose of this example is to demonstrate minimal usage of llama.cpp for running models.

```bash
llama-run granite3-moe
```
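Since `granite3-moe` is neither an existing local file nor prefixed with a protocol, it is resolved as `ollama://granite3-moe` and pulled from the Ollama registry (see the model resolution rules in the help output below). The help text can be printed with the documented `-h`/`--help` flag:

```bash
llama-run --help
```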
```bash
Description:
  Runs a llm

Usage:
  llama-run [options] model [prompt]

Options:
  -c, --context-size <value>
      Context size (default: 2048)
  -n, -ngl, --ngl <value>
      Number of GPU layers (default: 0)
  --temp <value>
      Temperature (default: 0.8)
  -v, --verbose, --log-verbose
      Set verbosity level to infinity (i.e. log all messages, useful for debugging)
  -h, --help
      Show help message

Commands:
  model
      Model is a string with an optional prefix of
      huggingface:// (hf://), ollama://, https:// or file://.
      If no protocol is specified and a file exists in the specified
      path, file:// is assumed, otherwise if a file does not exist in
      the specified path, ollama:// is assumed. Models that are being
      pulled are downloaded with .partial extension while being
      downloaded and then renamed as the file without the .partial
      extension when complete.

Examples:
  llama-run llama3
  llama-run ollama://granite-code
  llama-run ollama://smollm:135m
  llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
  llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
  llama-run https://example.com/some-file1.gguf
  llama-run some-file2.gguf
  llama-run file://some-file3.gguf
  llama-run --ngl 999 some-file4.gguf
  llama-run --ngl 999 some-file5.gguf Hello World
```
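As a sketch of how these options combine, the following invocation reuses a model reference from the examples above; the flag values and prompt are illustrative, not recommended defaults:

```bash
# Offload up to 999 layers to the GPU, use a 4096-token context,
# sample at temperature 0.2, and pass the prompt as trailing arguments.
llama-run --ngl 999 -c 4096 --temp 0.2 \
  hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf \
  "Briefly explain what a context window is."
```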