πŸ“‚ Solvrays Finetuned PDF - Document AI

🌟 Model Overview

This model is a fine-tuning of google/gemma-2b-it targeted at zero-hallucination technical retrieval. It was trained on a proprietary dataset of technical and architectural documentation to ensure deep contextual grounding.

πŸš€ Key Capabilities

  • Technical Grounding: Prioritizes factual documentation over generative speculation.
  • Chunk-Aware Memory: Optimized for overlapping document segments (256-token window).
  • Deterministic Precision: Best used with do_sample=False for architectural accuracy.

πŸ’» Professional Implementation

The model requires specific prompt construction to trigger its 'Knowledge Retrieval' mode:

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = 'solvrays/solvrays-finetuned-pdf'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map='auto',
    torch_dtype=torch.bfloat16,
    # quantization_config expects a BitsAndBytesConfig, not a plain dict
    quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)

def query_model(user_query):
    # High-Precision Retrieval Template
    prompt = f'### Knowledge Retrieval Content: {user_query}\n### Verified Response: '
    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split('### Verified Response:')[-1].strip()
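Because the decoded output still contains the full prompt, the final `.split()` step isolates the answer. A small illustration of that extraction logic on a mock decoded string (the query and answer text here are made up for demonstration):

```python
# Mock decoded output (prompt + completion), illustrating the
# response-extraction step used in query_model above.
decoded = ("### Knowledge Retrieval Content: What rank was used for LoRA?\n"
           "### Verified Response: The LoRA rank (r) is 16.")

# Take everything after the last response marker and trim whitespace.
answer = decoded.split('### Verified Response:')[-1].strip()
print(answer)  # -> The LoRA rank (r) is 16.
```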

πŸ“Š Technical Specifications

| Feature | Configuration |
| --- | --- |
| Base Model | google/gemma-2b-it |
| Precision | BrainFloat16 (BF16) |
| Fine-tuning | QLoRA (4-bit NormalFloat, NF4) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | q, k, v, o, gate, up, down |
| Training Epochs | 25 |
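The adapter hyper-parameters above can be expressed as a `peft` configuration. A minimal sketch, assuming the full target-module names follow Gemma's standard projection-layer naming (the table abbreviates them, so the `_proj` suffixes are an assumption):

```python
from peft import LoraConfig

# Reconstruction of the adapter config from the table above.
# The target-module names are assumed from Gemma's layer naming.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```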

πŸ›  Training Environment

  • Hardware: NVIDIA L4 x 2 (Dual GPU Architecture)
  • Optimizer: Paged AdamW 8-bit
  • Context Length: 256 tokens per block
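The 256-token block size above implies that longer documents must be split into overlapping segments before retrieval. A minimal sketch of that chunking over a token-id sequence (the 64-token overlap is an illustrative assumption, not a documented value):

```python
def chunk_tokens(token_ids, window=256, overlap=64):
    """Split a token-id sequence into overlapping fixed-size blocks.

    The 256-token window matches the training block size; the overlap
    value is an assumption chosen for illustration.
    """
    if window <= overlap:
        raise ValueError("window must exceed overlap")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # last block already covers the tail
    return chunks

# A 600-token document yields three overlapping 256-token blocks.
print(len(chunk_tokens(list(range(600)))))  # -> 3
```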

⚠️ Constraints & Risk Mitigation

  • Out-of-Scope: This model is not intended for general conversation or creative writing. It is a specialized document analyst.
  • Hallucination Control: If the requested information was not present in its training documents, the model is trained to answer 'Not Documented' or return an empty response rather than speculate.
  • Numerical Accuracy: Always cross-verify critical measurements with original PDF source material.

Senior AI Architect & Developer: Solvrays

Safetensors · Model size: 3B params · Tensor type: BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support
