---
library_name: transformers
tags:
- math
- cot
- text-generation-inference
- preview
- experimental
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
---

# **Deepmath-Competitive-1.5B-Preview**

> **Deepmath-Competitive-1.5B-Preview** is a **chain-of-thought reasoning model** fine-tuned from **Qwen2.5-1.5B-Instruct**, purpose-built for solving **mathematical problems** in both **English** and **Chinese** with a focus on **long-context understanding**. It produces detailed step-by-step solutions in a compact model, which makes it a good fit for competitive exam preparation, tutoring systems, and math-focused AI assistants.

## **Key Features**

1. **Chain-of-Thought Math Reasoning**
   Trained to write out detailed intermediate steps for math problems, so each solution can be followed and checked line by line, which matters for both learning and validation.

2. **Bilingual Proficiency (English + Chinese)**
   Understands and solves math problems posed in **either English or Simplified Chinese**, supporting diverse educational needs.

3. **Long-Context Reasoning**
   Optimized for **long-form math problems** and word problem comprehension, enabling reasoning over extended contexts and compound queries.

4. **Compact yet Powerful**
   With just 1.5B parameters, it handles arithmetic, algebra, geometry, logic, and competitive exam-style word problems at low computational cost.

5. **Structured Step-by-Step Computation**
   Produces clean, stepwise outputs modeled on expert human problem-solving, so learners can follow the process and the logic behind each step.

## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Deepmath-Competitive-1.5B-Preview"

# Load the model and tokenizer. dtype and device placement are chosen automatically;
# device_map="auto" requires the `accelerate` package to be installed.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
messages = [
    {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and open the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
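
Because the model writes out its reasoning, downstream code often needs only the final value. The following is a minimal sketch of one way to pull it out, assuming the last number in the response is the final answer. This card does not specify an output format, so that assumption may not hold for every prompt. `extract_last_number` is a hypothetical helper, and the sample response is hand-written for illustration rather than real model output.

```python
import re
from typing import Optional


def extract_last_number(response: str) -> Optional[float]:
    """Return the last number that appears in the text, or None if there is none.

    Assumption: the final answer is the last number the model writes.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", response)
    return float(matches[-1]) if matches else None


# Hand-written stand-in for a model response (NOT real model output).
sample_response = "Average speed = distance / time = 180 km / 3 h = 60 km/h."

# Reference value for the Quickstart prompt: 180 km over 3 hours.
expected = 180 / 3

answer = extract_last_number(sample_response)
is_correct = answer is not None and abs(answer - expected) < 1e-9
print(answer, is_correct)
```

In practice, `sample_response` would be replaced by the `response` string produced in the Quickstart. A check like this is only a coarse filter, since it says nothing about whether the intermediate steps are sound.
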

## **Intended Use**

- **Math Tutoring Bots**: Delivers in-depth, multi-step solutions for students preparing for competitive and school-level math.
- **Bilingual Educational Apps**: Works in both English and Chinese teaching environments.
- **STEM Reasoning Tools**: Supports structured reasoning across science and engineering questions.
- **Compact LLM Deployments**: Suitable for resource-constrained, latency-sensitive environments such as mobile apps, edge devices, or web integrations.
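
When judging whether the model fits a target device, a rough first estimate of weight memory is the parameter count multiplied by the bytes per parameter. The sketch below assumes the nominal 1.5B parameter count from the model name (the exact count may differ) and ignores activations, the KV cache, and runtime overhead. The listed precisions are illustrative and do not imply that quantized checkpoints are published for this model; `weight_memory_gb` is a hypothetical helper.

```python
# Nominal parameter count taken from the model name; the exact count may differ.
PARAM_COUNT = 1.5e9

# Bytes needed to store one parameter at each (illustrative) precision.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16/bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(param_count: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return param_count * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_memory_gb(PARAM_COUNT, size):.2f} GB")
```

Actual memory use during generation will be higher than these figures, especially with long contexts.
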

## **Limitations**

1. **Domain Focus**:
   Primarily tuned for mathematics; performance may drop outside STEM or logical domains.

2. **Model Scale**:
   While efficient, it may underperform larger models on abstract or research-level problems.

3. **Inherited Biases**:
   As a fine-tune of Qwen2.5-1.5B-Instruct, it may retain biases from the base model's pretraining. Human review is advised in critical applications.

4. **Prompt Sensitivity**:
   Performs best with clearly structured prompts and formal question phrasing.
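
One way to keep prompts consistently structured is to build them through a small helper instead of writing message lists by hand. The sketch below is an assumption, not a documented prompt format for this model: `build_messages` is a hypothetical helper, the system prompt is the one used in the Quickstart, and the `Solve:` prefix simply mirrors the Quickstart example.

```python
from typing import Dict, List

# System prompt copied from the Quickstart example.
SYSTEM_PROMPT = (
    "You are a helpful tutor skilled in solving math problems "
    "with step-by-step explanations."
)


def build_messages(problem: str) -> List[Dict[str, str]]:
    """Build a chat message list with a consistently formatted math question."""
    # Collapse stray whitespace and line breaks into single spaces.
    cleaned = " ".join(problem.split())
    if not cleaned:
        raise ValueError("problem must not be empty")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Solve: {cleaned}"},
    ]


messages = build_messages("  A train travels 180 km\n in 3 hours. What is its average speed? ")
print(messages[1]["content"])
```

The returned list can be passed directly to `tokenizer.apply_chat_template` in place of the hand-written `messages` from the Quickstart.
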