
TensorMind (0.5B)

TensorMind is a 536.9M-parameter causal language model for lightweight Chinese/English text generation.

Model Details

  • Architecture: Decoder-only Transformer (TensorMindForCausalLM)
  • Layers: 32
  • Hidden size: 1024
  • Heads / KV heads: 16 / 8 (GQA)
  • Context length: 32,768
  • Vocab size: 32,768
  • Positional encoding: RoPE
  • Activation: SiLU
  • Parameters: 536,941,568 (~0.5B)
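The hyperparameters above roughly account for the stated parameter count. A minimal back-of-the-envelope sketch, assuming tied input/output embeddings, a gated SwiGLU-style MLP with intermediate size 4096, and one weight vector per norm layer (none of these three details are stated on this card):

```python
# Rough parameter count from the listed hyperparameters.
# Assumptions (not stated on the card): tied embeddings, gated (SwiGLU-style)
# MLP with intermediate size 4096, two RMSNorms per layer plus a final norm.
vocab, hidden, layers, heads, kv_heads = 32_768, 1024, 32, 16, 8
head_dim = hidden // heads                         # 64
kv_dim = kv_heads * head_dim                       # 512 (GQA: fewer KV heads)
mlp_dim = 4096                                     # assumed intermediate size

embed = vocab * hidden                             # token embeddings (tied with lm_head)
attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # Q, O full-size; K, V reduced by GQA
mlp = 3 * hidden * mlp_dim                         # gate, up, down projections
norms = layers * 2 * hidden + hidden               # per-layer norms + final norm

total = embed + layers * (attn + mlp) + norms
print(total)  # lands within ~0.001% of the stated 536,941,568
```

The small residual versus the official count likely comes from minor components (e.g. extra norm weights) not modeled in this sketch.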

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "TensorMind/TensorMind"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)
# Move the model onto the GPU when one is available; otherwise it stays on CPU.
model.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = "请用三句话介绍一下你自己。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Benchmark Snapshot

Evaluated zero-shot (n-shot = 0) on 2026-03-07 at 00:40 (UTC+8).

Model        Params   C-Eval   CMMLU   A-CLUE   TMMLU+   AGIEval
TensorMind   0.5B     27.27    25.26   25.43    24.96    33.56
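For context, C-Eval, CMMLU, A-CLUE, and TMMLU+ are four-option multiple-choice benchmarks, so random guessing scores about 25 (AGIEval mixes formats, so that baseline applies only loosely there). A quick sketch comparing the reported scores against that chance level:

```python
# Compare the reported zero-shot scores with the ~25% baseline that random
# guessing achieves on four-option multiple-choice benchmarks.
scores = {
    "C-Eval": 27.27,
    "CMMLU": 25.26,
    "A-CLUE": 25.43,
    "TMMLU+": 24.96,
    "AGIEval": 33.56,
}
chance = 25.0  # four-choice baseline; AGIEval mixes formats, so treat it loosely

average = sum(scores.values()) / len(scores)
print(f"average: {average:.2f}")
for name, score in scores.items():
    print(f"{name:8s} {score:5.2f}  ({score - chance:+.2f} vs. chance)")
```

Most scores sit near the chance baseline, which is consistent with the limitation noted below: for a 0.5B model, these multiple-choice numbers say little about open-ended generation quality.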


Intended Use

  • Lightweight chat and text generation
  • Local experimentation and teaching
  • Baseline model for research and fine-tuning

Limitations

  • This is a small model and can produce factual errors.
  • Benchmark numbers above are from multiple-choice style evaluations and do not fully represent open-ended generation quality.
  • Outputs may contain bias or unsafe content; apply filtering for production use.

License

MIT License.
