Instructions to use codellama/CodeLlama-70b-hf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use codellama/CodeLlama-70b-hf with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="codellama/CodeLlama-70b-hf")

# Load model directly (CodeLlama is a text-only causal language model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-70b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-70b-hf")

Local Apps

vLLM

How to use codellama/CodeLlama-70b-hf with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "codellama/CodeLlama-70b-hf"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "codellama/CodeLlama-70b-hf",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
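
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is reachable at localhost:8000. The helper names `build_completion_request` and `complete` are hypothetical, and the optional `stop` field comes from the OpenAI-compatible completions API rather than from the curl example above.

```python
import json
import urllib.request

# Assumption: the server from the previous step is listening on this address.
BASE_URL = "http://localhost:8000/v1/completions"
MODEL = "codellama/CodeLlama-70b-hf"


def build_completion_request(prompt, max_tokens=512, temperature=0.5,
                             stop=None, url=BASE_URL):
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if stop is not None:
        # Optional list of strings at which the server should stop generating.
        payload["stop"] = stop
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("def fibonacci(n):", max_tokens=128, stop=["\ndef "])` would return the generated continuation as a string.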

Use Docker

docker model run hf.co/codellama/CodeLlama-70b-hf

SGLang

How to use codellama/CodeLlama-70b-hf with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "codellama/CodeLlama-70b-hf" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "codellama/CodeLlama-70b-hf",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "codellama/CodeLlama-70b-hf" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "codellama/CodeLlama-70b-hf",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use codellama/CodeLlama-70b-hf with Docker Model Runner:
```
docker model run hf.co/codellama/CodeLlama-70b-hf
```

Provide prompt examples

by tangles - opened Jan 31, 2024

Discussion

tangles

Jan 31, 2024

Please provide some prompt examples, along with the expected prompt formatting and stop tokens.

Eric1104

Jan 31, 2024

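Please help me write a Python program that reads Excel files. The Excel files are in a directory, and its subdirectories also contain xlsx files. Please use pyopenxl.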

whoami02

Jan 31, 2024 • edited Jan 31, 2024

@Eric1104 Use pandas, langchain.document_loaders.unstructured, or LlamaIndex. There are tons of options available.
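
For reference, the request from Eric1104 could be sketched roughly as follows, assuming "pyopenxl" refers to the openpyxl package. The function names are hypothetical, and skipping Excel's `~$` lock files is an added assumption rather than part of the request.

```python
from pathlib import Path


def find_xlsx_files(root):
    """Return every .xlsx file under root, including subdirectories."""
    return sorted(
        path
        for path in Path(root).rglob("*.xlsx")
        # Assumption: "~$name.xlsx" files are Excel lock files, not workbooks.
        if path.is_file() and not path.name.startswith("~$")
    )


def read_workbook_rows(path):
    """Yield (sheet_title, row_values) for every row of every sheet."""
    # Imported here so the file search works even without openpyxl installed.
    from openpyxl import load_workbook

    workbook = load_workbook(path, read_only=True, data_only=True)
    try:
        for sheet in workbook.worksheets:
            for row in sheet.iter_rows(values_only=True):
                yield sheet.title, row
    finally:
        workbook.close()


def print_all_rows(root):
    """Print each row of each workbook found under root."""
    for path in find_xlsx_files(root):
        for sheet_title, row in read_workbook_rows(path):
            print(path, sheet_title, row)
```

Calling `print_all_rows("some/directory")` would walk the directory tree and print one line per spreadsheet row.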

nevzata

Feb 1, 2024 • edited Feb 1, 2024

The code output of Llama-based models screws up Python indentation so badly that the code neither works nor can be fixed by auto-formatters. Only a manual fix can make the code work again. Has anyone else noticed this?
Take a look at this simple Python code it generated yesterday. The lines after "def:", "except:" and the last "if:" are indented by only 1 space character. Also, "if:" and "elif:" have different margins. All of this makes the code buggy and unfixable. There are cases with 1, 2, 3, and 4 spaces!

import sys,base64
def main():
 try:
     if len(sys.argv)>=3:
         opcode = str(sys.argv[1]) #operation code
         data = str(sys.argv[2]) #data
         
         if opcode == "enc":
             encoded_string = base64.b64encode(bytes(data,"utf8"))
             result = f"Encoded String:\n{encoded_string}"
             
         elif opcode == "dec":
            decoded_string = base64.b64decode(str(data))
            result = f"Decoded String:\n {decoded_string}"
        else:
           raise Exception("Invalid Operation Code")
      except IndexError:
       print('Please provide two arguments')
      except ValueError:
       print('Please enter valid input')
      except Exception as e:
       print(f'An error occurred: {e}')
if __name__== '__main__':
 main()
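
For comparison, a manually re-indented version of the snippet above might look like the following sketch. It assumes the `else:` branch belongs to the `if`/`elif` chain on `opcode` and that all three `except` clauses pair with the single `try`. The `argv` parameter, the `print(result)` call, and the return value are additions that are not in the generated code.

```python
import base64
import sys


def main(argv=None):
    # Assumption: argv is injectable for testing; the original read sys.argv directly.
    argv = sys.argv if argv is None else argv
    result = None
    try:
        if len(argv) >= 3:
            opcode = str(argv[1])  # operation code
            data = str(argv[2])  # data

            if opcode == "enc":
                encoded_string = base64.b64encode(bytes(data, "utf8"))
                result = f"Encoded String:\n{encoded_string}"
            elif opcode == "dec":
                decoded_string = base64.b64decode(str(data))
                result = f"Decoded String:\n {decoded_string}"
            else:
                raise Exception("Invalid Operation Code")

            # Assumption: the generated code built `result` but never printed it.
            print(result)
    except IndexError:
        print("Please provide two arguments")
    except ValueError:
        print("Please enter valid input")
    except Exception as e:
        print(f"An error occurred: {e}")
    return result


if __name__ == "__main__":
    main()
```

With consistent four-space indentation the script parses, and an invocation with the arguments `enc hi` prints the Base64 encoding of `hi`.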
