How to use GoofyLM/N1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="GoofyLM/N1")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("GoofyLM/N1")
model = AutoModelForCausalLM.from_pretrained("GoofyLM/N1")
How to use GoofyLM/N1 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "GoofyLM/N1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GoofyLM/N1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or run with Docker:
docker model run hf.co/GoofyLM/N1
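The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the code assumes the vLLM server from the previous step is listening on `http://localhost:8000` and returns the usual OpenAI-style completions response.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request mirroring the curl call."""
    payload = {
        "model": "GoofyLM/N1",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as response:
        body = json.load(response)
    # Assumption: OpenAI-style completions responses put the text here.
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` would return the continuation. Passing `base_url="http://localhost:30000"` should target an SGLang server started on that port instead.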
How to use GoofyLM/N1 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "GoofyLM/N1" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GoofyLM/N1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "GoofyLM/N1" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GoofyLM/N1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use GoofyLM/N1 with Docker Model Runner:
docker model run hf.co/GoofyLM/N1
Banner by Croissant
N1 is a small, experimental Chain-of-Thought (CoT) model based on the LLaMA architecture, developed by GoofyLM.
{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful AI assistant named N1, trained by GoofyLM<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
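The Jinja chat template above wraps each message in `<|im_start|>` / `<|im_end|>` markers and prepends a default system prompt when the conversation does not start with one. As a rough illustration of the prompt it produces, here is a pure-Python sketch that mirrors the template's logic; the function name `render_chat` is hypothetical and not part of any library. In practice the tokenizer applies the template for you.

```python
DEFAULT_SYSTEM = "You are a helpful AI assistant named N1, trained by GoofyLM"


def render_chat(messages, add_generation_prompt=True):
    """Mirror the chat template: ChatML-style turns with a default system prompt."""
    parts = []
    for index, message in enumerate(messages):
        # The default system turn is only injected before the first message,
        # and only when that first message is not already a system message.
        if index == 0 and message["role"] != "system":
            parts.append(f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n")
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chat([{"role": "user", "content": "What is 17 + 25?"}]))
```

Because the default system turn is emitted inside the message loop, an empty conversation yields only the open assistant turn (or nothing when `add_generation_prompt` is false).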
This model is designed for text generation, with a focus on reasoning through problems step by step using Chain-of-Thought.
The model can be loaded using the following:
# Load the full-precision model and tokenizer with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("GoofyLM/N1")
tokenizer = AutoTokenizer.from_pretrained("GoofyLM/N1")
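Once the model and tokenizer are loaded, a reasoning prompt can be run through the chat template and `generate`. This is a minimal sketch, not a recommended configuration: the helper names `build_messages` and `generate_answer` are hypothetical, the `max_new_tokens` default is an arbitrary illustrative value, and the code assumes a recent Transformers release in which `apply_chat_template` accepts `return_dict=True`.

```python
def build_messages(question):
    """Wrap a question as a single-turn chat conversation."""
    return [{"role": "user", "content": question}]


def generate_answer(question, max_new_tokens=256):
    """Download GoofyLM/N1 (on first use) and generate a reply to the question."""
    # Imported here so that defining the helpers does not load the heavy library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("GoofyLM/N1")
    model = AutoModelForCausalLM.from_pretrained("GoofyLM/N1")

    # Apply the model's chat template and open an assistant turn.
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, skipping the prompt.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_answer("If a train travels 60 km in 45 minutes, what is its average speed?")` would download the weights on first use and return the model's step-by-step reply as a string.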
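# Load the Q8_0 GGUF quantization with llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
    repo_id="GoofyLM/N1-quant",
    filename="N1_Q8_0.gguf",
)
# Or run the same quantization with Ollama (shell command)
ollama run hf.co/GoofyLM/N1-quant:Q8_0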