Phind-70B

Phind-70B is a fine-tuned version of Llama 3.3 70B Instruct, optimized for code generation, technical reasoning, and general instruction following.

Model Details

Attribute        Details
Base Model       meta-llama/Llama-3.3-70B-Instruct
Model Type       Causal Language Model
Parameters       70 Billion
Context Length   128K tokens
Language         English
License          Llama 3.3 Community License

Intended Use

Phind-70B is designed for:

  • Code generation across multiple programming languages
  • Technical problem-solving and debugging
  • General instruction following and reasoning tasks
  • Multi-turn conversations requiring context retention

How to Use

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Phind/Phind-70B"

# device_map="auto" shards the weights across all available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Phind, an intelligent assistant that helps with programming and technical questions."},
    {"role": "user", "content": "Write a Python function to find the longest palindromic substring."},
]

# Render the messages with the Llama 3 chat template and tokenize.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
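For long completions you may want tokens printed to the console as they are generated rather than all at once. A minimal sketch using Transformers' TextStreamer, reusing the model, tokenizer, and input_ids from above:

from transformers import TextStreamer

# Print tokens as they arrive, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    streamer=streamer,
)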

Chat Template

This model uses the Llama 3 chat format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{assistant_response}<|eot_id|>
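You normally don't need to build this string by hand; apply_chat_template renders it from a message list. A quick way to inspect the formatted prompt (note that the tokenizer's bundled template may insert additional header fields, such as a date line, beyond the skeleton shown above):

# Render the template to a string instead of token IDs.
prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)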

Hardware Requirements

Precision    VRAM Required
FP16/BF16    ~140 GB
INT8         ~70 GB
INT4         ~35 GB

For full-precision inference, we recommend multiple GPUs with tensor parallelism; on consumer hardware, use a quantized version, as sketched below.
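As an illustration of the INT4 row above, here is a minimal sketch of 4-bit loading through Transformers' BitsAndBytesConfig. It assumes the bitsandbytes package is installed; actual memory use depends on your hardware and settings.

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "Phind/Phind-70B"

# NF4 quantization with bfloat16 compute, targeting the ~35 GB INT4 footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

For multi-GPU serving at full precision, inference engines such as vLLM support tensor parallelism (e.g., a tensor_parallel_size matching your GPU count).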

Limitations

  • May occasionally generate incorrect or misleading information
  • Not suitable for production use without additional safety measures
  • Performance may vary on tasks outside the training distribution
  • Should not be used for generating harmful, illegal, or unethical content

Acknowledgments

This model builds upon the excellent work by Meta on the Llama 3.3 model family. We are grateful for their contributions to open-source AI.
