Commit 8aacd19 by singtan · verified · 1 Parent(s): 8243d60

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +12 -24
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- base_model: google/gemma-2b
  language: en
  library_name: transformers
  license: apache-2.0
@@ -13,60 +13,48 @@ tags:
  ---

  # 📂 Solvrays Llm - High Precision Document Analyst
-
- ## 🌟 Overview
- This model is a specialized fine-tuning of **google/gemma-2b**, engineered for **Zero-Hallucination Document Retrieval**. It has been optimized to handle complex, domain-specific documents (Technical, Legal, or Architectural) with strict adherence to provided context.
-
- ### 🛠 Primary Design Objectives
  - **Factual Integrity**: Programmed to prioritize 'Not Documented' over speculating.
  - **Contextual Continuity**: Overlap-aware training prevents information loss across page boundaries.
  - **Domain Versatility**: Seamlessly switches between technical and non-technical document styles.
-
- ## 💻 Professional Usage (Grounded Inference)
  To achieve the trained precision level, utilize the following code implementation:
-
- ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch

  model_id = 'solvrays/solvrays-llm'
  tokenizer = AutoTokenizer.from_pretrained(model_id)
- model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto', torch_dtype=torch.float16)

  # Universal Grounding Template
- instruction = 'Analyze the following document and provide a precise, factual response based strictly on the content provided. If the information is not present, you must state that it is not documented.'
  query = 'What are the main infrastructure requirements?'

  prompt = (f'### Instruction: {instruction}\n'
  f'### Knowledge Context: {query}\n'
  f'### Verified Response:')

- prompt = (f'### Instruction: {instruction}
- '
- ### Knowledge Context: Extract the overview and key details from this document.
- f'### Verified Response:')
-
  inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
  with torch.no_grad():
      outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)

  print(tokenizer.decode(outputs[0], skip_special_tokens=True).split('### Verified Response:')[-1].strip())
  ```
-
- ## 📊 Technical Specifications
  | Parameter | Configuration |
  | :--- | :--- |
- | Base Model | google/gemma-2b |
  | Fine-tuning Method | QLoRA (4-bit quantization) |
  | LoRA Rank (r) | 16 |
  | LoRA Alpha | 32 |
  | Training Epochs | 5 |
  | Context Strategy | 512 tokens with 128-token overlap |
-
- ## ⚠️ Risks & Limitations
  - **Context Window**: Strictly limited to the fine-tuned block size (512 tokens). For longer multi-page queries, RAG (Retrieval Augmented Generation) is recommended.
  - **Bias**: The model reflects the biases of the provided training documentation.
  - **Accuracy**: Always verify critical technical numbers against the original source.
-
- ---
  **Architected and Fine-tuned by Bibek Lama Singtan**
 
  ---
+ base_model: google/gemma-2b-it
  language: en
  library_name: transformers
  license: apache-2.0
 
  ---

  # 📂 Solvrays Llm - High Precision Document Analyst
+ \n## 🌟 Overview
+ This model is a specialized fine-tuning of **google/gemma-2b-it**, engineered for **Zero-Hallucination Document Retrieval**. It has been optimized to handle complex, domain-specific documents (Technical, Legal, or Architectural) with strict adherence to provided context.
+ \n### 🛠 Primary Design Objectives
  - **Factual Integrity**: Programmed to prioritize 'Not Documented' over speculating.
  - **Contextual Continuity**: Overlap-aware training prevents information loss across page boundaries.
  - **Domain Versatility**: Seamlessly switches between technical and non-technical document styles.
+ \n## 💻 Professional Usage (Grounded Inference)
  To achieve the trained precision level, utilize the following code implementation:
+ \n```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  import torch

  model_id = 'solvrays/solvrays-llm'
  tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto', torch_dtype=torch.bfloat16)

  # Universal Grounding Template
+ instruction = 'Analyze your internal knowledge base and provide a precise, factual response based strictly on the documentation you have been trained on. If the information is not documented, state that it is not documented.'
  query = 'What are the main infrastructure requirements?'

  prompt = (f'### Instruction: {instruction}\n'
  f'### Knowledge Context: {query}\n'
  f'### Verified Response:')

  inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
  with torch.no_grad():
      outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)

  print(tokenizer.decode(outputs[0], skip_special_tokens=True).split('### Verified Response:')[-1].strip())
  ```
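The grounding template in the usage snippet can be factored into two small helpers, which makes the prompt layout and the answer extraction checkable without loading the model. This is a minimal sketch: `build_prompt`, `extract_response`, and `RESPONSE_MARKER` are hypothetical names that do not appear in the repository, and the section markers are copied from the README as shown in the diff.

```python
# Hypothetical helpers (not part of the repository) that mirror the
# prompt layout used in the README's usage snippet.
RESPONSE_MARKER = '### Verified Response:'


def build_prompt(instruction: str, query: str) -> str:
    """Assemble the three-section grounding prompt."""
    return (f'### Instruction: {instruction}\n'
            f'### Knowledge Context: {query}\n'
            f'{RESPONSE_MARKER}')


def extract_response(decoded: str) -> str:
    """Return the text after the last response marker, stripped."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()


if __name__ == '__main__':
    prompt = build_prompt('Answer strictly from the documentation.',
                          'What are the main infrastructure requirements?')
    # Simulate a decoded generation: the prompt followed by the answer.
    print(extract_response(prompt + ' Not documented.'))
```

Splitting on the last marker means that an echoed prompt is discarded even if the marker text also occurs earlier in the decoded output.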
+ \n## 📊 Technical Specifications
  | Parameter | Configuration |
  | :--- | :--- |
+ | Base Model | google/gemma-2b-it |
  | Fine-tuning Method | QLoRA (4-bit quantization) |
  | LoRA Rank (r) | 16 |
  | LoRA Alpha | 32 |
  | Training Epochs | 5 |
  | Context Strategy | 512 tokens with 128-token overlap |
+ \n## ⚠️ Risks & Limitations
  - **Context Window**: Strictly limited to the fine-tuned block size (512 tokens). For longer multi-page queries, RAG (Retrieval Augmented Generation) is recommended.
  - **Bias**: The model reflects the biases of the provided training documentation.
  - **Accuracy**: Always verify critical technical numbers against the original source.
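The 512-token block size with a 128-token overlap suggests a sliding window over the tokenized document. The sketch below shows one way to produce such windows from a list of token IDs. `chunk_with_overlap` is a hypothetical helper, and the stride of 384 (512 minus 128) is an assumption about how the overlap was applied; the README does not describe the actual preprocessing.

```python
from typing import List


def chunk_with_overlap(token_ids: List[int],
                       block_size: int = 512,
                       overlap: int = 128) -> List[List[int]]:
    """Split token_ids into windows of at most block_size tokens,
    where consecutive windows share `overlap` tokens."""
    if not 0 <= overlap < block_size:
        raise ValueError('overlap must be in [0, block_size)')
    stride = block_size - overlap  # assumed: 512 - 128 = 384
    chunks = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + block_size])
        if start + block_size >= len(token_ids):
            break  # the last window already reaches the end
        start += stride
    return chunks


if __name__ == '__main__':
    ids = list(range(1000))  # stand-in for real token IDs
    windows = chunk_with_overlap(ids)
    # Print the first and last token ID of each window.
    print([(w[0], w[-1]) for w in windows])
```

For documents longer than one block, each window could be queried separately or indexed for retrieval, in line with the RAG recommendation above.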
+ \n---
  **Architected and Fine-tuned by Bibek Lama Singtan**