singtan committed
Commit 5586bf7 · verified · 1 Parent(s): bda6ac6

Upload README.md with huggingface_hub

Files changed (1): README.md (+18 −49)
README.md CHANGED
@@ -1,35 +1,19 @@
  ---
  {}
  ---
- ---
- license: apache-2.0
- library_name: transformers
- base_model: google/gemma-2b
- tags:
- - text-generation
- - fine-tuned
- - pdf-grounded
- - zero-hallucination
- - domain-specific
- language:
- - en
- pipeline_tag: text-generation
- ---

- # 📂 Solvrays Llm (Precision Grounded)
+ # 📂 Solvrays Llm (Ground-Truth Precise)

  ## 🌟 Overview
- This model is a high-precision version of **Gemma 2B**, fine-tuned for **strict document retrieval and analysis**. It has been conditioned through deterministic instruction-grounding templates to minimize hallucinations and prioritize facts extracted directly from the provided source materials.
-
- ### 🛠 Core Advancements
- - **Universal Document Grounding**: Optimized for both technical and non-technical corpora.
- - **Negative Constraint Training**: Trained to prioritize "Not Documented" over guessing when information is missing.
- - **Deterministic Inference**: Configured for greedy decoding to ensure factual consistency.
- - **High Continuity**: Trained with 128-token chunk overlap to preserve context across page boundaries.
+ This is a specialized fine-tuned version of **Gemma 2B**, optimized for **high-precision document retrieval**. It has been trained with strict grounding templates to minimize hallucination and produce deterministic, factual responses.
+
+ ## 🛠 Key Advanced Features
+ - **Zero-Hallucination Mode**: Deterministic greedy decoding by default.
+ - **Negative Constraint Awareness**: Trained to avoid guessing when information is missing.
+ - **Domain Agnostic**: Works with any technical or non-technical PDF provided as context.
+ - **Standalone Conversion**: Fully merged FP16 weights for production deployment.

- ## 💻 Precision Quick Start
- To ensure zero-hallucination, use the following **Universal Grounding Prompt**:
-
+ ## 💻 Quick Start (Inference)
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch
@@ -38,37 +22,22 @@
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

- # The EXACT prompt used during fine-tuning grounding
  instruction = "Analyze the following document and provide a precise, factual response based strictly on the content provided. If the information is not present, you must state that it is not documented."
- document_source = "Your_Document_Name.pdf"
- query = "Explain the key requirements from the document."
-
  prompt = f"""### Instruction: {instruction}
- ### Source: {document_source}
- ### Content: {query}
+ ### Source: Document_Name.pdf
+ ### Content: Your Query Here
  ### Verified Response:"""
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
-
- with torch.no_grad():
-     outputs = model.generate(
-         **inputs,
-         max_new_tokens=256,
-         do_sample=False,  # Absolute precision mode
-         repetition_penalty=1.5
-     )
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Verified Response:")[-1].strip())
  ```

- ## 📊 Training Methodology
- - **Model Foundation**: google/gemma-2b
- - **Technique**: QLoRA with 4-bit NormalFloat (nf4) quantization.
- - **Rank (r)**: 16 | **Alpha**: 32 (balanced capacity/stability).
- - **Epochs**: 5 (intensive fact reinforcement).
- - **Hardware**: Optimized for NVIDIA L4/V100/A100 environments.
-
- ## 📜 License & Usage
- Usage is governed by the Apache-2.0 license and the Gemma Prohibited Use Policy. Ideal for internal technical summaries, compliance checking, and document search tasks.
+ ## 📊 Training Methodology
+ - **Base Model**: google/gemma-2b
+ - **Quantization**: 4-bit (NormalFloat4)
+ - **LoRA Config**: r=16, alpha=32, target_modules=all linear layers
+ - **Epochs**: 5 (intensive reinforcement)

  ---
- **Professionally Fine-tuned by Bibek Lama Singtan**
+ **Fine-tuned by Bibek Lama Singtan**
 
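The Quick Start template in the diff carries only a file name under `### Source:` and the user query under `### Content:`, so the document text itself still has to reach the model somehow. A minimal sketch of one way to inject extracted PDF text into that template; the `grounded_prompt` helper, the use of `pypdf`, and the placement of the extracted text are assumptions of this sketch, not something the card specifies:

```python
# Hypothetical helper: extract a PDF's text with pypdf and place it in the
# card's grounding template. Putting the document text under "### Content:"
# is an assumption; the card itself only passes a file name and a query.
from pypdf import PdfReader

def grounded_prompt(pdf_path: str, query: str, instruction: str) -> str:
    document_text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    return (
        f"### Instruction: {instruction}\n"
        f"### Source: {pdf_path}\n"
        f"### Content: {document_text}\n\nQuestion: {query}\n"
        f"### Verified Response:"
    )
```

Gemma 2B's 8K-token context window bounds how much extracted text fits, which is presumably where the 128-token chunk overlap mentioned in the removed card would come into play.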
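The Training Methodology bullets map directly onto a `peft`/`bitsandbytes` setup. A sketch reconstructing that configuration from the stated hyperparameters (r=16, alpha=32, NF4 4-bit); the dropout value, the compute dtype, and the concrete Gemma projection names behind "all linears" are assumptions, since the card does not publish them:

```python
# QLoRA-style configuration matching the card's stated hyperparameters.
# Values marked "assumed" are not on the card.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat, as stated
    bnb_4bit_compute_dtype=torch.float16,   # assumed
)

lora_config = LoraConfig(
    r=16,                                   # stated rank
    lora_alpha=32,                          # stated alpha
    lora_dropout=0.05,                      # assumed
    task_type="CAUSAL_LM",
    target_modules=[                        # "all linears" on Gemma; names assumed
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```

`bnb_config` would be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `lora_config` to `peft.get_peft_model` ahead of the five training epochs the card reports.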
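The features list advertises **Standalone Conversion**, i.e. fully merged FP16 weights rather than a LoRA adapter. A sketch of how such a merge is typically produced with `peft`; the adapter path and output directory are placeholders, as the card does not show its export step:

```python
# Hypothetical export step: fold a trained LoRA adapter into the base model
# and save standalone FP16 weights. Both paths are placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b", torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()

merged.save_pretrained("solvrays-llm-merged")  # standalone FP16 checkpoint
AutoTokenizer.from_pretrained("google/gemma-2b").save_pretrained("solvrays-llm-merged")
```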
 
 
 
 