Phind-70B

Phind-70B is a fine-tuned version of Llama 3.3 70B Instruct, optimized for code generation, technical reasoning, and general instruction following.

Model Details

Attribute        Details
Base Model       meta-llama/Llama-3.3-70B-Instruct
Model Type       Causal Language Model
Parameters       70 Billion
Context Length   128K tokens
Language         English
License          Llama 3.3 Community License

Intended Use

Phind-70B is designed for:

  • Code generation across multiple programming languages
  • Technical problem-solving and debugging
  • General instruction following and reasoning tasks
  • Multi-turn conversations requiring context retention

How to Use

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Phind/Phind-70B"

# device_map="auto" shards the weights across all available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Phind, an intelligent assistant that helps with programming and technical questions."},
    {"role": "user", "content": "Write a Python function to find the longest palindromic substring."},
]

# Render the messages with the Llama 3 chat template and tokenize.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
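For long completions you may want tokens printed to the console as they are generated rather than all at once. A minimal sketch using Transformers' TextStreamer, reusing the model, tokenizer, and input_ids from above:

from transformers import TextStreamer

# Print tokens as they arrive, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    streamer=streamer,
)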

Chat Template

This model uses the Llama 3 chat format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{assistant_response}<|eot_id|>
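You normally don't need to build this string by hand; apply_chat_template renders it from a message list. A quick way to inspect the formatted prompt (note that the tokenizer's bundled template may insert additional header fields, such as a date line, beyond the skeleton shown above):

# Render the template to a string instead of token IDs.
prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)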

Hardware Requirements

Precision    VRAM Required
FP16/BF16    ~140 GB
INT8         ~70 GB
INT4         ~35 GB

For full-precision inference, we recommend multiple GPUs with tensor parallelism; on consumer hardware, use a quantized version, as sketched below.
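As an illustration of the INT4 row above, here is a minimal sketch of 4-bit loading through Transformers' BitsAndBytesConfig. It assumes the bitsandbytes package is installed; actual memory use depends on your hardware and settings.

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "Phind/Phind-70B"

# NF4 quantization with bfloat16 compute, targeting the ~35 GB INT4 footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

For multi-GPU serving at full precision, inference engines such as vLLM support tensor parallelism (e.g., a tensor_parallel_size matching your GPU count).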

Limitations

  • May occasionally generate incorrect or misleading information
  • Not suitable for production use without additional safety measures
  • Performance may vary on tasks outside the training distribution
  • Should not be used for generating harmful, illegal, or unethical content

Acknowledgments

This model builds upon the excellent work by Meta on the Llama 3.3 model family. We are grateful for their contributions to open-source AI.
