README.md (CHANGED)

@@ -77,34 +77,38 @@ Plaintext
+-------------------------------------+
```

REMOVED:

## Methodology
----------------------------------------------------

- By distilling knowledge from a high-parameter Teacher into a 7B model, we significantly reduce computational requirements without sacrificing reasoning capability.
- The approach captures the **brains of domain experts**, built upon decades of domain expertise and engineering practices.
- Optimized training and alignment ensure that the model delivers high accuracy with minimal resource consumption.
- This enables deployment on cost-efficient infrastructure, including edge environments, while maintaining enterprise-grade performance.
- **Technical Authority:** Demonstrates the use of advanced alignment techniques tailored for domain-aware reasoning and real-world system constraints.

3. **RLAIF (Reinforcement Learning from AI Feedback):** To ensure high reliability, we implemented an RLAIF loop where the Teacher model (Gemini Pro) acted as an automated evaluator. The model was guided to improve accuracy, consistency, and context-aware decision-making through continuous feedback.

4. **The Result:** A model that possesses the intelligence of a much larger AI system while operating with the speed and cost-efficiency required for real-time industrial monitoring and analytics.

----------------------------------------------------
* * * * *
+-------------------------------------+
```

ADDED:

## Methodology of Training
----------------------------------------------------

To achieve high-fidelity reasoning in a compact 7B parameter footprint, Energy-Intelligence was developed through a **Distillation & RLHF Architecture**:

1. **RLHF (Reinforcement Learning from Human Feedback):**

   Human evaluators review multiple responses generated by the model and select the better one. The model improves based on these preferences, making it more accurate, helpful, and aligned with real-world expectations.
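The preference step above can be sketched as a pairwise reward objective: a reward model is trained so that the response human evaluators chose scores above the one they rejected. This is a generic Bradley-Terry-style sketch with illustrative reward values, not the project's actual training code.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    # Numerically stable form of -log(sigmoid(margin)): log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Illustrative reward scores for one (chosen, rejected) pair of responses
loss_correct_ranking = preference_loss(2.0, -1.0)   # chosen already scores higher
loss_inverted_ranking = preference_loss(-1.0, 2.0)  # ranking disagrees with humans

# The loss is larger when the reward model disagrees with the human choice
print(loss_correct_ranking, loss_inverted_ranking)
```

Minimizing this loss over many human-labeled pairs is what nudges the model toward the "more accurate, helpful" behavior described above.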
2. **Synthetic Data Generation:**

   We utilized synthetic data generated by the Teacher model to capture domain knowledge and real-world scenarios, enabling scalable training with improved accuracy and coverage of complex use cases.
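A minimal sketch of such a synthetic-data loop, assuming a hypothetical `call_teacher` helper in place of the real Gemini Pro API call (which this card does not show): domain scenarios are sent to the teacher and its answers are collected as instruction/response training pairs.

```python
def call_teacher(prompt: str) -> str:
    """Hypothetical stand-in for the real teacher-model API call
    (e.g. Gemini Pro); returns a canned answer so the loop is runnable."""
    return f"[teacher answer for: {prompt}]"

def generate_synthetic_pairs(scenarios: list[str]) -> list[dict]:
    """Turn domain scenarios into (instruction, response) training pairs."""
    pairs = []
    for scenario in scenarios:
        prompt = f"As an energy-domain expert, analyze: {scenario}"
        pairs.append({"instruction": prompt, "response": call_teacher(prompt)})
    return pairs

# Illustrative scenarios; not taken from the actual training set
dataset = generate_synthetic_pairs([
    "sudden voltage drop on feeder line 12",
    "turbine vibration trending above its baseline",
])
print(len(dataset))  # 2
```

In practice the collected pairs would also be filtered and deduplicated before fine-tuning.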
3. **Distillation:**

   - **The Oracle (Teacher):** We utilized Gemini Pro as a high-parameter teacher model, providing it with domain knowledge, business logic, and complex system understanding to generate high-quality learning data.
   - **The Specialist (Student):** The Qwen2.5-7B-Instruct base model was fine-tuned on this curated dataset, effectively capturing the Teacher’s advanced reasoning in a more efficient form.

4. **The Result:**

   A model that possesses the intelligence of a much larger AI system while operating with the speed and cost-efficiency required for real-time industrial monitoring and analytics.
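Since the teacher is reached through an API, the distillation described above is effectively sequence-level: the student is fine-tuned with ordinary cross-entropy on the teacher-generated text rather than on teacher logits. A toy sketch of that objective, with made-up probabilities over a 3-token vocabulary:

```python
import math

def cross_entropy(student_probs: list[float], teacher_token: int) -> float:
    """Per-token loss: -log of the probability the student assigns
    to the token the teacher actually emitted."""
    return -math.log(student_probs[teacher_token])

def sequence_distillation_loss(step_probs: list[list[float]],
                               teacher_tokens: list[int]) -> float:
    """Mean cross-entropy of the student against the teacher's token
    sequence; fine-tuning on teacher text minimizes exactly this."""
    losses = [cross_entropy(p, t) for p, t in zip(step_probs, teacher_tokens)]
    return sum(losses) / len(losses)

# Toy student distributions over a 3-token vocabulary, plus the
# teacher's token ids at each step (illustrative values only)
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
teacher_ids = [0, 1]
print(sequence_distillation_loss(probs, teacher_ids))
```

The loss shrinks as the student concentrates probability on the teacher's tokens, which is how the 7B student absorbs the larger model's behavior.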
----------------------------------------------------

### Why Adding RLHF Matters for the Model Card

- **Precision:** The model is refined using human feedback, improving the quality and reliability of responses.
- **Domain Safety:** Reduces the risk of incorrect outputs that could impact critical energy operations.
- **Human Alignment:** Ensures the model behaves in a helpful, consistent, and context-aware manner aligned with human expectations.

Our methodology focuses on embedding the intelligence of large-scale systems into a compact and efficient architecture:

- By distilling knowledge from a high-parameter Teacher into a 7B model, we significantly reduce computational requirements without sacrificing reasoning capability.
- The approach captures the **brains of domain experts**, built upon decades of domain expertise and engineering practices.
- Optimized training and alignment ensure that the model delivers high accuracy with minimal resource consumption.
- This enables deployment on cost-efficient infrastructure, including edge environments, while maintaining enterprise-grade performance.

* * * * *