πŸ“‚ Solvrays Llm - High Precision Document Analyst

\n## 🌟 Overview This model is a specialized fine-tuning of google/gemma-2b-it, engineered for Zero-Hallucination Document Retrieval. It has been optimized to handle complex, domain-specific documents (Technical, Legal, or Architectural) with strict adherence to provided context. \n### πŸ›  Primary Design Objectives

  • Factual Integrity: Programmed to prioritize 'Not Documented' over speculating.
  • Contextual Continuity: Overlap-aware training prevents information loss across page boundaries.
  • Domain Versatility: Seamlessly switches between technical and non-technical document styles. \n## πŸ’» Professional Usage (Grounded Inference)

To achieve the trained precision level, utilize the following code implementation: \n```python from transformers import AutoTokenizer, AutoModelForCausalLM import torch

model_id = 'solvrays/solvrays-llm' tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto', torch_dtype=torch.bfloat16)

Universal Grounding Template

instruction = 'Analyze your internal knowledge base and provide a precise, factual response based strictly on the documentation you have been trained on. If the information is not documented, state that it is not documented.' query = 'What are the main infrastructure requirements?'

prompt = (f'### Instruction: {instruction}\n' f'### Knowledge Context: {query}\n' f'### Verified Response:')

inputs = tokenizer(prompt, return_tensors='pt').to(model.device) with torch.no_grad(): outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)

print(tokenizer.decode(outputs[0], skip_special_tokens=True).split('### Verified Response:')[-1].strip())

\n## πŸ“Š Technical Specifications
| Parameter | Configuration |
| :--- | :--- |
| Base Model | google/gemma-2b-it |
| Fine-tuning Method | QLoRA (4-bit quantization) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Training Epochs | 5 |
| Context Strategy | 512 tokens with 128-token overlap |
\n## ⚠️ Risks & Limitations
- **Context Window**: Strictly limited to the fine-tuned block size (512 tokens). For longer multi-page queries, RAG (Retrieval Augmented Generation) is recommended.
- **Bias**: The model reflects the biases of the provided training documentation.
- **Accuracy**: Always verify critical technical numbers against the original source.
\n--- 
**Architected and Fine-tuned by Bibek Lama Singtan**
Downloads last month
445
Safetensors
Model size
3B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for solvrays/solvrays-llm

Finetuned
(116)
this model