jaykv committed on
Commit
19d65fc
·
verified ·
1 Parent(s): 4b269de

Upload README.md with huggingface_hub

  1. README.md +281 -185
README.md CHANGED
@@ -1,202 +1,298 @@
1
  ---
2
  base_model: defog/llama-3-sqlcoder-8b
3
  library_name: peft
 
4
  ---
5
 
6
- # Model Card for Model ID
7
 
8
- <!-- Provide a quick summary of what the model is/does. -->
9
 
 
10
 
 
11
 
12
- ## Model Details
13
 
14
- ### Model Description
15
 
16
- <!-- Provide a longer summary of what this model is. -->
17
 
 
18
 
 
19
 
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
-
28
- ### Model Sources [optional]
29
-
30
- <!-- Provide the basic links for the model. -->
31
-
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
-
36
- ## Uses
37
-
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
-
40
- ### Direct Use
41
-
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
-
44
- [More Information Needed]
45
-
46
- ### Downstream Use [optional]
47
-
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
-
50
- [More Information Needed]
51
-
52
- ### Out-of-Scope Use
53
-
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
-
56
- [More Information Needed]
57
-
58
- ## Bias, Risks, and Limitations
59
-
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
-
62
- [More Information Needed]
63
-
64
- ### Recommendations
65
-
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
-
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
-
70
- ## How to Get Started with the Model
71
-
72
- Use the code below to get started with the model.
73
-
74
- [More Information Needed]
75
-
76
- ## Training Details
77
-
78
- ### Training Data
79
-
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
-
84
- ### Training Procedure
85
-
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
-
92
-
93
- #### Training Hyperparameters
94
-
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
-
97
- #### Speeds, Sizes, Times [optional]
98
-
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
-
101
- [More Information Needed]
102
-
103
- ## Evaluation
104
-
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
-
107
- ### Testing Data, Factors & Metrics
108
-
109
- #### Testing Data
110
-
111
- <!-- This should link to a Dataset Card if possible. -->
112
-
113
- [More Information Needed]
114
-
115
- #### Factors
116
-
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
-
119
- [More Information Needed]
120
-
121
- #### Metrics
122
-
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
-
125
- [More Information Needed]
126
-
127
- ### Results
128
-
129
- [More Information Needed]
130
-
131
- #### Summary
132
-
133
-
134
-
135
- ## Model Examination [optional]
136
-
137
- <!-- Relevant interpretability work for the model goes here -->
138
-
139
- [More Information Needed]
140
-
141
- ## Environmental Impact
142
-
143
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
-
145
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
-
147
- - **Hardware Type:** [More Information Needed]
148
- - **Hours used:** [More Information Needed]
149
- - **Cloud Provider:** [More Information Needed]
150
- - **Compute Region:** [More Information Needed]
151
- - **Carbon Emitted:** [More Information Needed]
152
-
153
- ## Technical Specifications [optional]
154
-
155
- ### Model Architecture and Objective
156
-
157
- [More Information Needed]
158
-
159
- ### Compute Infrastructure
160
-
161
- [More Information Needed]
162
-
163
- #### Hardware
164
-
165
- [More Information Needed]
166
-
167
- #### Software
168
-
169
- [More Information Needed]
170
-
171
- ## Citation [optional]
172
-
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
-
175
- **BibTeX:**
176
-
177
- [More Information Needed]
178
-
179
- **APA:**
180
-
181
- [More Information Needed]
182
-
183
- ## Glossary [optional]
184
-
185
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
-
187
- [More Information Needed]
188
-
189
- ## More Information [optional]
190
-
191
- [More Information Needed]
192
-
193
- ## Model Card Authors [optional]
194
-
195
- [More Information Needed]
196
 
197
- ## Model Card Contact
198
 
199
- [More Information Needed]
200
- ### Framework versions
201
 
202
- - PEFT 0.10.0
 
1
  ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - text-to-sql
7
+ - sql
8
+ - tally
9
+ - accounting
10
+ - erp
11
+ - llama-3
12
+ - sqlcoder
13
+ - postgresql
14
+ - finance
15
  base_model: defog/llama-3-sqlcoder-8b
16
  library_name: peft
17
+ pipeline_tag: text-generation
18
  ---
19
 
20
+ # Tally SQLCoder - Fine-tuned for TallyPrime ERP
21
 
22
+ **Created by:** Jay Viramgami
23
 
24
+ A fine-tuned LLaMA 3 SQLCoder model specialized for converting natural language questions to PostgreSQL queries for **TallyPrime ERP** systems.
25
 
26
+ ## 🎯 Model Description
27
 
28
+ This model is specifically trained to understand accounting and business terminology used in TallyPrime ERP and generate accurate SQL queries for a PostgreSQL database migrated from Tally.
29
 
30
+ ### Key Features
31
 
32
+ - 🏦 **Accounting Domain Expertise** - Understands financial terms, GST, vouchers, ledgers
33
+ - 📊 **28 Database Tables** - Covers all master and transaction tables from Tally
34
+ - 🎓 **ICAI Compliant** - Based on Indian accounting standards
35
+ - 🚀 **Efficient Fine-tuning** - Trained with QLoRA (4-bit), so only a LoRA adapter is added on top of the base model
36
+ - 💯 **High Accuracy** - Fine-tuned on 5,000 synthetic Tally-specific text-to-SQL pairs
37
+
38
+ ### Use Cases
39
+
40
+ - Customer receivables and vendor payables analysis
41
+ - Sales and purchase reporting
42
+ - Inventory and stock management queries
43
+ - GST and tax compliance reports
44
+ - Financial statements (Profit & Loss, Balance Sheet)
45
+ - Voucher and transaction searches
46
+
47
+ ## 📊 Model Details
48
+
49
+ - **Base Model:** [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
50
+ - **Fine-tuning Method:** QLoRA (4-bit quantization)
51
+ - **Training Data:** 5,000 synthetic Tally accounting text-to-SQL pairs
52
+ - **Target Database:** PostgreSQL (Tally migration schema with 28 tables)
53
+ - **Training Platform:** Kaggle (NVIDIA T4 GPU)
54
+ - **Training Time:** ~4 hours
55
+ - **Final Training Loss:** 0.05-0.07
56
+
57
+ ### Query Categories Supported
58
+
59
+ | Category | Templates | Examples |
60
+ |----------|-----------|----------|
61
+ | Simple Filters | 20 | "Show all customers", "List bank accounts" |
62
+ | Date Ranges | 20 | "Sales in March 2024", "Payments this quarter" |
63
+ | Aggregations | 25 | "Total sales amount", "Top 10 customers" |
64
+ | Joins | 25 | "Customer-wise outstanding", "Item sales by godown" |
65
+ | Accounting | 20 | "P&L items", "Assets and liabilities" |
66
+ | GST/Tax | 15 | "GST collected", "TDS deducted" |
67
+ | Inventory | 15 | "Stock movements", "Items with zero balance" |
68
+ | Financial Statements | 10 | "Trial balance", "Balance sheet data" |
69
+
70
+ ## 🚀 Quick Start
71
+
72
+ ### Installation
73
+
74
+ ```bash
75
+ pip install transformers peft torch accelerate bitsandbytes
76
+ ```
77
+
78
+ ### Basic Usage
79
+
80
+ ```python
81
+ from transformers import AutoModelForCausalLM, AutoTokenizer
82
+ from peft import PeftModel
83
+ import torch
84
+
85
+ # Load base model
86
+ base_model = AutoModelForCausalLM.from_pretrained(
87
+     "defog/llama-3-sqlcoder-8b",
88
+     device_map="auto",
89
+     torch_dtype=torch.float16
90
+ )
91
+
92
+ # Load fine-tuned adapter
93
+ model = PeftModel.from_pretrained(base_model, "jaykv/tally-sqlcoder-finetuned")
94
+ tokenizer = AutoTokenizer.from_pretrained("jaykv/tally-sqlcoder-finetuned")
95
+
96
+ # Generate SQL
97
+ question = "Show all customers with outstanding balance above 50000"
98
+ schema = """CREATE TABLE mst_ledger (
99
+ name VARCHAR(1024),
100
+ parent VARCHAR(1024),
101
+ closing_balance DECIMAL(17,2)
102
+ );"""
103
+
104
+ prompt = f"""### Task
105
+ Generate a SQL query to answer [QUESTION]{question}[/QUESTION]
106
+
107
+ ### Database Schema
108
+ The query will run on a database with the following schema:
109
+ {schema}
110
+
111
+ ### Answer
112
+ Given the database schema, here is the SQL query that answers [QUESTION]{question}[/QUESTION]
113
+ [SQL]"""
114
+
115
+ inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048).to(model.device)
116
+
117
+ with torch.no_grad():
118
+     outputs = model.generate(
119
+         **inputs,
120
+         max_new_tokens=300,
121
+         temperature=0.1,
122
+         do_sample=True,
123
+         pad_token_id=tokenizer.eos_token_id
124
+     )
125
+
126
+ result = tokenizer.decode(outputs[0], skip_special_tokens=True)
127
+ sql = result.split("[SQL]")[-1].strip()
128
+ print(sql)
129
+ ```
130
+
131
+ **Output:**
132
+ ```sql
133
+ SELECT name, closing_balance
134
+ FROM mst_ledger
135
+ WHERE parent = 'Sundry Debtors'
136
+ AND closing_balance > 50000
137
+ ```
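The prompt layout used in Basic Usage can be wrapped in two small helpers so that every call builds the prompt and parses the output the same way. This is a minimal sketch: `build_prompt` and `extract_sql` are hypothetical names, not part of the model repository, and the handling of a trailing `[/SQL]` tag is a defensive assumption about what the model may emit.

```python
# Hypothetical helpers mirroring the prompt format shown in Basic Usage.
PROMPT_TEMPLATE = """### Task
Generate a SQL query to answer [QUESTION]{question}[/QUESTION]

### Database Schema
The query will run on a database with the following schema:
{schema}

### Answer
Given the database schema, here is the SQL query that answers [QUESTION]{question}[/QUESTION]
[SQL]"""


def build_prompt(question: str, schema: str) -> str:
    """Fill the template with a question and the CREATE TABLE statements."""
    return PROMPT_TEMPLATE.format(question=question.strip(), schema=schema.strip())


def extract_sql(generated_text: str) -> str:
    """Return the text after the last [SQL] marker.

    Assumption: the model may close the query with [/SQL]; if it does,
    everything from that tag onward is dropped.
    """
    sql = generated_text.split("[SQL]")[-1]
    sql = sql.split("[/SQL]")[0]
    return sql.strip().rstrip(";").strip()
```

With these in place, `prompt = build_prompt(question, schema)` replaces the inline f-string, and `extract_sql(result)` replaces the manual `split("[SQL]")` step.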
138
+
139
+ ## 📝 Example Queries
140
+
141
+ ### Simple Customer Query
142
+ **Question:** "Show all customers"
143
+ **Generated SQL:**
144
+ ```sql
145
+ SELECT name FROM mst_ledger WHERE parent = 'Sundry Debtors'
146
+ ```
147
+
148
+ ### Sales Analysis
149
+ **Question:** "What is the total sales amount for March 2024?"
150
+ **Generated SQL:**
151
+ ```sql
152
+ SELECT SUM(ABS(amount)) as total_sales
153
+ FROM trn_accounting ta
154
+ JOIN trn_voucher tv ON ta.guid = tv.guid
155
+ WHERE tv.voucher_type = 'Sales'
156
+ AND tv.date BETWEEN '2024-03-01' AND '2024-03-31'
157
+ ```
158
+
159
+ ### Top Customers
160
+ **Question:** "Show top 10 customers by sales"
161
+ **Generated SQL:**
162
+ ```sql
163
+ SELECT tv.party_name, SUM(ABS(ta.amount)) as total_sales
164
+ FROM trn_voucher tv
165
+ JOIN trn_accounting ta ON tv.guid = ta.guid
166
+ WHERE tv.voucher_type = 'Sales'
167
+ GROUP BY tv.party_name
168
+ ORDER BY total_sales DESC
169
+ LIMIT 10
170
+ ```
171
+
172
+ ### GST Query
173
+ **Question:** "Show GST payable amount"
174
+ **Generated SQL:**
175
+ ```sql
176
+ SELECT name, closing_balance
177
+ FROM mst_ledger
178
+ WHERE parent = 'Duties & Taxes'
179
+ AND name LIKE '%GST%'
180
+ ```
181
+
182
+ ## 🗄️ Database Schema
183
+
184
+ The model is trained on a PostgreSQL schema with **28 tables** from TallyPrime:
185
+
186
+ ### Master Tables (15)
187
+ - `mst_ledger` - Customers, vendors, banks, expenses, incomes
188
+ - `mst_group` - Account group hierarchy
189
+ - `mst_stock_item` - Inventory items with GST details
190
+ - `mst_stock_group` - Stock categories
191
+ - `mst_vouchertype` - Voucher type definitions
192
+ - `mst_godown` - Warehouse locations
193
+ - `mst_cost_centre` - Cost centers
194
+ - And 8 more...
195
+
196
+ ### Transaction Tables (13)
197
+ - `trn_voucher` - All financial transactions
198
+ - `trn_accounting` - Ledger-wise entries
199
+ - `trn_inventory` - Item-wise stock movements
200
+ - `trn_bill` - Bill allocations
201
+ - `trn_bank` - Bank transaction details
202
+ - And 8 more...
203
+
204
+ ## 📈 Training Details
205
+
206
+ ### Dataset
207
+ - **Size:** 5,000 text-to-SQL pairs
208
+ - **Source:** Synthetically generated using 150 query templates
209
+ - **Split:** 90/10 train/test
210
+ - **Categories:** 8 query types covering all Tally operations
211
+
212
+ ### Training Configuration
213
+ - **Method:** QLoRA (Quantized Low-Rank Adaptation)
214
+ - **Quantization:** 4-bit (NF4)
215
+ - **LoRA Rank:** 16
216
+ - **LoRA Alpha:** 32
217
+ - **Target Modules:** q_proj, k_proj, v_proj, o_proj
218
+ - **Batch Size:** 2 per device
219
+ - **Gradient Accumulation:** 4 steps
220
+ - **Learning Rate:** 2e-4
221
+ - **Epochs:** ~3 (1,600 steps)
222
+ - **Optimizer:** PagedAdamW 8-bit
223
+ - **Max Sequence Length:** 2048 tokens
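The relationship between batch size, gradient accumulation, and step count can be checked with a little arithmetic. The sketch below assumes a single GPU and 4,500 training examples (90% of the 5,000 pairs); neither figure is stated explicitly in the configuration above, and a trainer's own step counting may differ slightly.

```python
import math


def optimizer_steps(num_examples: int, per_device_batch: int,
                    grad_accum: int, epochs: int, num_devices: int = 1):
    """Return (effective batch size, total optimizer steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Assumed: 4,500 training examples, one T4 GPU.
effective_batch, total_steps = optimizer_steps(4500, 2, 4, 3)
print(effective_batch, total_steps)      # 8 1689

# Epochs covered by the reported 1,600 steps (563 steps per epoch).
print(round(1600 / 563, 2))              # 2.84
```

Under these assumptions, three full epochs would take about 1,689 steps, so the reported 1,600 steps correspond to roughly 2.8 epochs, which is consistent with "~3".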
224
+
225
+ ### Hardware
226
+ - **Platform:** Kaggle Notebooks
227
+ - **GPU:** NVIDIA T4 (16GB)
228
+ - **Training Time:** ~4 hours
229
+
230
+ ## 📊 Performance
231
+
232
+ - **Valid SQL Syntax:** >95%
233
+ - **Keyword Match:** >85%
234
+ - **Exact Match (normalized):** >70%
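The card does not say how queries are normalized for the exact-match figure. One plausible reading, shown here only as an assumption, is to collapse whitespace, drop a trailing semicolon, and compare case-insensitively. `normalize_sql` and `exact_match_rate` are hypothetical names.

```python
import re


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Simplification: lowercasing also affects string literals, so
    'Sundry Debtors' and 'sundry debtors' compare as equal.
    """
    sql = sql.strip().rstrip(";")
    sql = re.sub(r"\s+", " ", sql)
    return sql.strip().lower()


def exact_match_rate(predictions, references) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and aligned")
    hits = sum(
        normalize_sql(p) == normalize_sql(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)
```

Exact match is a strict metric: two queries that return the same rows but order their columns or conditions differently still count as a miss.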
235
+
236
+ ## ⚠️ Limitations
237
+
238
+ - **Tally-Specific:** Optimized for TallyPrime PostgreSQL schema
239
+ - **PostgreSQL Only:** SQL generated for PostgreSQL dialect
240
+ - **Schema Required:** Needs database schema in the prompt
241
+ - **Context Window:** Prompts (schema plus question) are limited to 2048 tokens, the maximum sequence length used in training
242
+ - **Custom Schemas:** May require additional fine-tuning for non-Tally schemas
243
+
244
+ ## 🔧 Deployment Tips
245
+
246
+ ### For Production Use:
247
+ 1. **Add validation** - Verify generated SQL before execution
248
+ 2. **Read-only mode** - Restrict to SELECT queries only
249
+ 3. **Query timeout** - Set execution time limits
250
+ 4. **Error handling** - Catch and handle syntax errors
251
+ 5. **Logging** - Track all queries for audit
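The validation and read-only points above can be approximated with a conservative text check before a query reaches the database. This is a sketch rather than a full SQL parser: `is_read_only` is a hypothetical helper, and it deliberately rejects some safe queries (for example, one whose string literal contains the word "update"). A database role with SELECT-only privileges remains the stronger safeguard.

```python
import re

# Keywords that indicate a write or DDL operation in PostgreSQL.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|copy)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT/WITH statement with no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or multiple statements
    if not re.match(r"(?is)^(select|with)\b", statement):
        return False  # must start with SELECT or WITH
    return FORBIDDEN.search(statement) is None
```

A caller would run `is_read_only(sql)` on the generated text and refuse to execute anything that returns `False`, logging the rejected query for audit.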
252
+
253
+ ### Optimization:
254
+ - Use GPU for faster inference (2-3 seconds per query)
255
+ - CPU inference works but is slower (~10-15 seconds)
256
+ - Consider caching frequently asked queries
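Caching frequently asked queries could look like the sketch below, which keys a small least-recently-used cache on a normalized form of the question. `QueryCache` and `generate_sql` are hypothetical names; `generate_sql` stands for whatever function wraps the model call. Because the Basic Usage example samples with `do_sample=True`, a cache also pins the first answer produced for each question.

```python
from collections import OrderedDict
from typing import Callable


def normalize_question(question: str) -> str:
    """Cache key: lowercase the question and collapse whitespace."""
    return " ".join(question.lower().split())


class QueryCache:
    """Small LRU cache in front of a text-to-SQL generator."""

    def __init__(self, generate_sql: Callable[[str], str], max_entries: int = 256):
        self._generate_sql = generate_sql
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, question: str) -> str:
        key = normalize_question(question)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        # Cache miss: pass the original question to the model unchanged.
        sql = self._generate_sql(question)
        self._entries[key] = sql
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return sql
```

The cache assumes a fixed schema; if the schema passed in the prompt changes, cached entries should be discarded.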
257
+
258
+ ## 📄 License
259
+
260
+ This model is released under the **Apache 2.0** license, inheriting from the base model.
261
+
262
+ ## 🙏 Acknowledgments
263
+
264
+ - **Base Model:** [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) by Defog.ai
265
+ - **Training Standards:** ICAI (Institute of Chartered Accountants of India) Foundation Course
266
+ - **Platform:** Trained on Kaggle's free GPU infrastructure
267
+
268
+ ## 📧 Contact
269
+
270
+ **Author:** Jay Viramgami
271
+
272
+ For questions, feedback, or collaboration inquiries, please open a discussion on the model's Community tab.
273
+
274
+ ## 🔗 Related Resources
275
+
276
+ - [TallyPrime ERP](https://tallysolutions.com/)
277
+ - [Defog SQLCoder](https://github.com/defog-ai/sqlcoder)
278
+ - [PEFT Library](https://github.com/huggingface/peft)
279
 
280
+ ---
281
 
282
+ ## Citation
283
 
284
+ If you use this model in your work, please cite:
285
 
286
+ ```bibtex
287
+ @misc{tally-sqlcoder-finetuned,
288
+ author = {Jay Viramgami},
289
+ title = {Tally SQLCoder - Fine-tuned for TallyPrime ERP},
290
+ year = {2024},
291
+ publisher = {HuggingFace},
292
+ url = {https://huggingface.co/jaykv/tally-sqlcoder-finetuned}
293
+ }
294
+ ```
295
 
296
+ ---
 
297
 
298
+ **Model Card created by Jay Viramgami | March 2024**