How to use with llama.cpp
Install via Homebrew (macOS/Linux)
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ego-hf/CodeLlama-7b-Python:F32
# Run inference directly in the terminal:
llama-cli -hf ego-hf/CodeLlama-7b-Python:F32
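Once llama-server is running, it can be queried over its OpenAI-compatible HTTP API. A minimal sketch, assuming the server's default listen address of http://127.0.0.1:8080 (adjust if you started it with different --host/--port flags):

```shell
# Send a chat completion request to the local llama-server instance.
# The endpoint and request shape follow the OpenAI chat completions API.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."}
        ]
      }'
```

The response is a JSON object with the generated text under `choices[0].message.content`, so any OpenAI-compatible client library can also be pointed at this server.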
Install via WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ego-hf/CodeLlama-7b-Python:F32
# Run inference directly in the terminal:
llama-cli -hf ego-hf/CodeLlama-7b-Python:F32
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ego-hf/CodeLlama-7b-Python:F32
# Run inference directly in the terminal:
./llama-cli -hf ego-hf/CodeLlama-7b-Python:F32
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ego-hf/CodeLlama-7b-Python:F32
# Run inference directly in the terminal:
./build/bin/llama-cli -hf ego-hf/CodeLlama-7b-Python:F32
Use Docker
docker model run hf.co/ego-hf/CodeLlama-7b-Python:F32
Quick Links

Model card: Meta CodeLlama-7b-Python (GGUF)

Origin: Meta's CodeLlama-7b-Python, a Code Llama large language model for coding, converted into GGUF format with llama.cpp.

License: "Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."


Run model

# `main` is the legacy llama.cpp binary name; in recent builds the same
# tool is called `llama-cli`.
./main -m ggml-model-f32-00001-of-00010.gguf -p "def fibonacci("

Convert to gguf

# convert.py ships with llama.cpp; in newer checkouts the equivalent
# script is convert_hf_to_gguf.py.
python3 convert.py ../codellama/CodeLlama-7b-Python

Split Model

The original Meta CodeLlama-7b-Python model was converted to GGUF with python3 convert.py (producing CodeLlama-7b-Python/ggml-model-f32.gguf) and then split with gguf-split into smaller chunks of up to 32 tensors each (--split-max-tensors 32).

python3 convert.py ../codellama/CodeLlama-7b-Python
./gguf-split --split --split-max-tensors 32 ./models/CodeLlama-7b-Python/ggml-model-f32.gguf ./models/CodeLlama-7b-Python/ggml-model-f32
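gguf-split names the resulting shards with a zero-padded pattern, `<prefix>-NNNNN-of-NNNNN.gguf`. The name of the first shard of this 10-way split, for example, can be reproduced with:

```shell
# Shards follow the pattern <prefix>-%05d-of-%05d.gguf; the first of
# ten shards for the prefix used above is:
printf 'ggml-model-f32-%05d-of-%05d.gguf\n' 1 10
# → ggml-model-f32-00001-of-00010.gguf
```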

Merge the splits back

# Pass the first shard and the output path; gguf-split locates the
# remaining shards automatically. (Recent llama.cpp builds can also
# load a split model directly from its first shard, without merging.)
./gguf-split --merge ggml-model-f32-00001-of-00010.gguf ggml-model-f32.gguf
Format: GGUF
Model size: 7B params
Architecture: llama