Upload README.md with huggingface_hub
README.md (CHANGED)
````diff
@@ -1,35 +1,19 @@
 ---
 {}
 ---
----
-license: apache-2.0
-library_name: transformers
-base_model: google/gemma-2b
-tags:
-- text-generation
-- fine-tuned
-- pdf-grounded
-- zero-hallucination
-- domain-specific
-language:
-- en
-pipeline_tag: text-generation
----
 
-# Solvrays Llm (
+# Solvrays Llm (Ground-Truth Precise)
 
 ## Overview
-This
-
-### Core Advancements
-- **Universal Document Grounding**: Optimized for both technical and non-technical corpora.
-- **Negative Constraint Training**: Trained to prioritize "Not Documented" over guessing when information is missing.
-- **Deterministic Inference**: Configured for greedy decoding to ensure factual consistency.
-- **High Continuity**: Trained with 128-token chunk overlap to preserve context across page boundaries.
+This is a specialized fine-tuned version of **Gemma 2B**, optimized for **High-Precision Document Retrieval**. It has been trained using strict grounding templates to ensure zero-hallucination and deterministic factual responses.
 
-##
-
+## Key Advanced Features
+- **Zero-Hallucination Mode**: Deterministic greedy decoding by default.
+- **Negative Constraint Awareness**: Trained to avoid guessing when information is missing.
+- **Domain Agnostic**: Works for any technical or non-technical PDF provided as context.
+- **Standalone Conversion**: Fully merged FP16 weights for production deployment.
 
+## Quick Start (Inference)
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
````
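The removed "High Continuity" bullet above describes training on overlapping chunks so that facts near a chunk boundary are seen in two windows. A minimal sketch of that kind of sliding window is below; only the 128-token overlap is stated in the card, so the 1024-token chunk size and the helper name are assumptions.

```python
def chunk_with_overlap(token_ids: list[int], chunk_size: int = 1024, overlap: int = 128) -> list[list[int]]:
    """Split a token sequence into windows that share `overlap` tokens,
    so content near a chunk/page boundary appears in two windows.
    Requires chunk_size > overlap."""
    step = chunk_size - overlap
    return [token_ids[i:i + chunk_size] for i in range(0, len(token_ids), step)]
```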
````diff
@@ -38,37 +22,22 @@
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
 
-# The EXACT prompt used during fine-tuning grounding
 instruction = "Analyze the following document and provide a precise, factual response based strictly on the content provided. If the information is not present, you must state that it is not documented."
-document_source = "Your_Document_Name.pdf"
-query = "Explain the key requirements from the document."
-
 prompt = f"### Instruction: {instruction}
-### Source:
-### Content:
+### Source: Document_Name.pdf
+### Content: Your Query Here
 ### Verified Response:"
-inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
-
-with torch.no_grad():
-    outputs = model.generate(
-        **inputs,
-        max_new_tokens=256,
-        do_sample=False,  # Absolute precision mode
-        repetition_penalty=1.5
-    )
 
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Verified Response:")[-1].strip())
 ```
 
-## Training
-- **
-- **
-- **
-- **Epochs**: 5 (Intensive
-- **Hardware**: Optimized for NVIDIA L4/V100/A100 environments.
-
-## License & Usage
-Usage is governed by the Apache-2.0 license and the Gemma Prohibited Use Policy. Ideal for internal technical summary, compliance checking, and document search tasks.
+## Training methodology
+- **Base Model**: google/gemma-2b
+- **Quantization**: 4-bit (NormalFloat4)
+- **LoRA Config**: r=16, alpha=32, target_modules=All linears
+- **Epochs**: 5 (Intensive Reinforcement)
 
 ---
-**
+**Fine-tuned by Bibek Lama Singtan**
````
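As committed, the Quick Start snippet is not directly runnable: `model_id` is never defined, and the multi-line prompt is built with a single-quoted f-string, which is a Python syntax error (this applies to both the old and the new version of the snippet). A corrected sketch of the same flow is below; the `model_id` value is a placeholder, not the model's actual repository id.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Placeholder repository id: substitute the actual Hub id of this model.
model_id = "your-username/solvrays-llm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

instruction = (
    "Analyze the following document and provide a precise, factual response "
    "based strictly on the content provided. If the information is not "
    "present, you must state that it is not documented."
)

# Triple-quoted so the multi-line grounding template is valid Python.
prompt = f"""### Instruction: {instruction}
### Source: Document_Name.pdf
### Content: Your Query Here
### Verified Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():  # inference only, as in the pre-change snippet
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,          # greedy decoding for deterministic output
        repetition_penalty=1.5,
    )
print(
    tokenizer.decode(outputs[0], skip_special_tokens=True)
    .split("### Verified Response:")[-1]
    .strip()
)
```

Note that the new template inlines the source name and query directly, whereas the old version interpolated `document_source` and `query` variables; either works once the f-string is triple-quoted.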
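The Training methodology bullets describe a standard QLoRA recipe. A minimal sketch of that configuration with `peft` and `bitsandbytes` is below, assuming recent versions of both libraries; learning rate, LoRA dropout, and the training data itself are not stated in the card and are omitted here.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# "Quantization: 4-bit (NormalFloat4)" -> NF4 base weights via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# "Base Model: google/gemma-2b"
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    quantization_config=bnb_config,
    device_map="auto",
)

# "LoRA Config: r=16, alpha=32, target_modules=All linears"
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # peft shorthand for every linear layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# "Standalone Conversion: fully merged FP16 weights" -- after training, the
# adapter is typically merged into an FP16 copy of the base model, e.g.:
#   from peft import PeftModel
#   base_fp16 = AutoModelForCausalLM.from_pretrained(
#       "google/gemma-2b", torch_dtype=torch.float16)
#   merged = PeftModel.from_pretrained(base_fp16, "path/to/adapter")
#   merged = merged.merge_and_unload()
```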
|