πŸ—οΈ Solvrays Llm Pdf (Grounded Version)

🌟 Overview

This is a specialized, fine-tuned version of Gemma 2B optimized for Ground-Truth Technical Retrieval. Unlike standard LLMs, this model has been conditioned through specific "Senior AI Engineering" grounding templates to minimize hallucinations and prioritize information extracted directly from technical documentation.

🛠️ Key Capabilities

  • Zero-Hallucination Mode: Uses deterministic greedy decoding (`do_sample=False`) for reproducible, low-variance outputs.
  • Direct Provenance: Trained to recognize specific technical documents as "Ground Truth".
  • Infrastructure Focused: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
  • Merged Weights: Standalone weights for high-speed, native inference.

💻 Grounded Quick Start (Precise Inference)

For the most accurate, well-grounded responses, use the following grounded prompt:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

# MUST use the Ground-Truth prompt template
prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,       # Greedy decoding for deterministic, reproducible output
        repetition_penalty=1.5
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
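The prompt above instructs the model to answer strictly from provided documentation, but as written it passes no document text. In a retrieval workflow you would splice the relevant PDF excerpt into the prompt. A minimal sketch of that step (the exact grounding template used during training is not published, so the delimiters and helper name below are assumptions):

```python
# Hypothetical grounding template; the card does not publish the exact one.
GROUNDING_HEADER = (
    "Based strictly on the provided architectural documentation, "
    "provide a precise summary of technical insights."
)

def build_grounded_prompt(document_excerpt: str, question: str) -> str:
    """Prepend retrieved document text so the model answers from it,
    not from parametric memory. Illustrative helper, not part of the model."""
    return (
        f"{GROUNDING_HEADER}\n\n"
        f"--- DOCUMENTATION ---\n{document_excerpt}\n"
        f"--- END DOCUMENTATION ---\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string can be passed to `tokenizer(...)` exactly as in the quick start above.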

📊 Engineering Specifications

  • Strategy: SFT (Supervised Fine-Tuning) with Grounding Headers.
  • Rank (r): 16 (High capacity for technical fact retention).
  • Epochs: 5 (Heavy Fact Reinforcement).
  • Context Window: 512 with 128-token overlap for fact-continuity.
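A 512-token window with 128-token overlap implies a sliding-window chunking pass over the source documents during preprocessing. A minimal sketch of such a chunker (the actual training pipeline is not published; the function name and logic are illustrative):

```python
def chunk_tokens(token_ids, window=512, overlap=128):
    """Split a token sequence into fixed-size windows that overlap,
    so facts spanning a chunk boundary appear intact in the next chunk.
    Hypothetical helper mirroring the card's stated 512/128 settings."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    stride = window - overlap  # step between chunk starts
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # final window already covers the tail
    return chunks
```

With the defaults, each chunk shares its last 128 tokens with the start of the next, which is what the "fact-continuity" overlap refers to.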

⚠️ Usage Recommendations

For production-grade accuracy, always verify specific numeric values with the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus.
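As a lightweight aid for that verification step, you can flag numeric values in a generation that never occur in the extracted source text. A hedged sketch (`unverified_numbers` is a hypothetical helper, not part of this model's tooling, and no substitute for checking the original PDF):

```python
import re

def unverified_numbers(model_output: str, source_text: str) -> list[str]:
    """Return numbers from the model output that do not appear verbatim
    in the source text, as candidates for manual verification."""
    number_pattern = re.compile(r"\d+(?:\.\d+)?")
    source_numbers = set(number_pattern.findall(source_text))
    return [n for n in number_pattern.findall(model_output)
            if n not in source_numbers]
```

An empty result does not prove correctness (a number can appear in the source with a different meaning), but a non-empty result pinpoints values worth double-checking.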


Scientifically Fine-tuned by Bibek Lama Singtan
