---
base_model:
- ajibawa-2023/Code-Llama-3-8B
- defog/llama-3-sqlcoder-8b
library_name: transformers
tags:
- mergekit
- merge
---
# llama3-8b-code-sql-slerp

llama3-8b-code-sql-slerp is a merge of two fine-tuned Llama 3 8B coding models, intended to provide a solid general programming foundation with particular expertise in SQL.
### 🤏 Models Merged

This model was produced with the SLERP (spherical linear interpolation) merge method using [mergekit](https://github.com/cg123/mergekit).

The following models were included in the merge:
* [ajibawa-2023/Code-Llama-3-8B](https://huggingface.co/ajibawa-2023/Code-Llama-3-8B)
* [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
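SLERP interpolates between two weight tensors along the shortest arc of a hypersphere rather than along a straight line, which tends to preserve the geometry of the weights better than plain averaging. As a rough illustration only (not mergekit's actual implementation, which handles many more edge cases), the core operation for a single tensor might be sketched as:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between flat weight vectors v0 and v1."""
    # Compute the angle between the two vectors using normalized copies
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    # Weight each endpoint so the result travels along the arc between them
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
```

At `t=0` this returns the first model's tensor, at `t=1` the second's, and intermediate values blend the two along the arc.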
### 🧩 Configuration

The following YAML configuration was used to produce this model:
```yaml
slices:
  - sources:
      - model: ajibawa-2023/Code-Llama-3-8B
        layer_range: [0, 32]
      - model: defog/llama-3-sqlcoder-8b
        layer_range: [0, 32]
merge_method: slerp
base_model: ajibawa-2023/Code-Llama-3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.3, 0.5, 0.7, 0.5]
    - filter: mlp
      value: [0, 0.3, 0.5, 0.7, 0.5]
    - value: 0.4 # fallback for rest of tensors
dtype: bfloat16
```
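The `t` lists act as an interpolation gradient across layer depth: each anchor list is expanded to one value per layer, so the earliest layers stay close to the base model (`t` near 0) while middle layers take more from the SQL model. A rough sketch of how such a gradient could be expanded over the 32 layers (this mirrors, but is not, mergekit's internal logic):

```python
import numpy as np

# Anchor values from the `t` gradient in the config above
anchors = [0, 0.3, 0.5, 0.7, 0.5]
num_layers = 32

# Spread the anchors evenly across layer depth, then linearly
# interpolate to get one blend factor per transformer layer
positions = np.linspace(0, len(anchors) - 1, num_layers)
per_layer_t = np.interp(positions, np.arange(len(anchors)), anchors)

print(per_layer_t.round(2))
```

The first layer uses `t=0` (pure base model), the blend peaks near `t=0.7` around three quarters of the way up the stack, and the final layer settles at `t=0.5`.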
### 💻 Usage

Loading in 8-bit quantization:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/llama3-8b-code-sql-slerp")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/llama3-8b-code-sql-slerp",
    device_map="cuda",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
input_text = "Can you write a query to retrieve the names and email addresses of all customers who have made purchases totaling over $1000 in the last month from our 'sales' database?"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output
outputs = model.generate(
    **input_ids,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id
)

# Decode and print the generated text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**Output**
```
\```sql
SELECT c.name, c.email
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
WHERE s.purchase_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH)
GROUP BY c.name, c.email
HAVING SUM(s.amount) > 1000;
\```

This query joins the 'customers' and 'sales' tables on the 'customer_id' field, filters for sales made in the last month, groups the results by customer name and email, and applies a condition to include only customers whose total purchase amount exceeds $1000. The result is a list of names and email addresses for customers who have made purchases totaling over $1000 in the last month.
```
*backslash added for formatting*