Lakshan Cooray
Lakshan2003
AI & ML interests
Natural Language Generation, LLMs, SLMs, LLM based Evaluation, Graph RAG, Industrial NLP
Recent Activity
- updated a dataset 1 day ago: Lakshan2003/context-summarization-llm-judge-results
- published a dataset 1 day ago: Lakshan2003/context-summarization-llm-judge-results
- updated a model 1 day ago: Lakshan2003/Llama-3.1-8B-Instruct-customerservice-context-summary
Organizations
None yet
Customer Service Context Summarization Fine-tuned Models
Fine-tuned models for context summarization in multi-turn customer service conversations.
- Lakshan2003/Qwen3-8B-Instruct-customerservice-context-summary
  Summarization • Updated • 53
- Lakshan2003/Phi-4-mini-instruct-customerservice-context-summary
  Summarization • Updated • 55
- Lakshan2003/Qwen3-4B-instruct-customerservice-context-summary
  Summarization • Updated • 64
- Lakshan2003/Llama3.2-3B-instruct-customerservice-context-summary
  Text Generation • Updated • 51
Customer Service Context Summarization Evaluation Data
Per-model evaluation datasets (~10k rows each) for context summarization experiments in customer service conversations.
- Lakshan2003/Llama3.2-instruct-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 24
- Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14
- Lakshan2003/Qwen3-8B-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14
- Lakshan2003/Qwen3-4B-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14
Customer Service Human Evaluation Data (Evaluator 3)
Per-model human evaluation datasets (evaluator_3) for customer service client-agent conversations.
- Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_3_data
  Viewer • Updated • 500 • 5
- Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_3_data
  Viewer • Updated • 500 • 6
- Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_3_data
  Viewer • Updated • 500 • 5
- Lakshan2003/GPT-4.1-customerservice-Human-evaluator_3_data
  Viewer • Updated • 500 • 7
Customer Service Human Evaluation Data (Evaluator 1)
Per-model human evaluation datasets (evaluator_1) for customer service client-agent conversations.
- Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_1_data
  Viewer • Updated • 500 • 5
- Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_1_data
  Viewer • Updated • 500 • 5
- Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_1_data
  Viewer • Updated • 500 • 5
- Lakshan2003/GPT-4.1-customerservice-Human-evaluator_1_data
  Viewer • Updated • 500 • 7
Pairwise Comparison (Gemini-2.5-Flash vs SLMs)
Pairwise comparison datasets used to evaluate SLM responses against Gemini-2.5-Flash on customer service client-agent conversations.
- Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-1.7b
  Viewer • Updated • 1k • 7
- Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-4b
  Viewer • Updated • 1k • 10
- Lakshan2003/pairwise-gemini-2.5-flash-vs-qwen3-8b
  Viewer • Updated • 1k • 6
- Lakshan2003/pairwise-gemini-2.5-flash-vs-phi-4-mini
  Viewer • Updated • 1k • 7
Customer Service LLM-as-a-Judge Evaluation Data
Per-model LLM-as-a-Judge evaluation datasets (~6k rows each) generated for customer service client-agent conversations.
- Lakshan2003/Qwen3-1.7B-customerservice-LLM-as-a-judge-data
  Viewer • Updated • 6k • 10
- Lakshan2003/Qwen3-4B-customerservice-LLM-as-a-judge-data
  Viewer • Updated • 6k • 6
- Lakshan2003/Qwen3-8B-customerservice-LLM-as-a-judge-data
  Viewer • Updated • 6k • 12
- Lakshan2003/Llama3.1-8b-instruct-customerservice-LLM-as-a-judge-data
  Viewer • Updated • 6k • 13
Customer Service Context Summarization LLM-as-a-judge
- Lakshan2003/gemini-2.5-flash-customerservice-context-summarization-llm-judge-data
  Viewer • Updated • 1k • 27
- Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-llm-judge-data
  Viewer • Updated • 1k • 22
- Lakshan2003/Llama3.2-3B-instruct-customerservice-context-summarization-llm-judge-data
  Viewer • Updated • 1k • 26
- Lakshan2003/Phi-4-mini-customerservice-context-summarization-llm-judge-data
  Viewer • Updated • 1k • 23
Customer Service QA Fine-tuned SLMs
Fine-tuned SLMs for context-summarized multi-turn customer service response generation.
- Lakshan2003/SmolLM3-3B-instruct-customerservice
  Text Generation • Updated • 2
- Lakshan2003/Qwen3-4B-instruct-customerservice
  Text Generation • Updated
- Lakshan2003/Phi-4-mini-instruct-customerservice
  Text Generation • Updated
- Lakshan2003/Qwen3-1.7B-instruct-customerservice
  Text Generation • Updated • 1
SLM Cost Benchmarking Datasets
Datasets used for benchmarking computational cost and inference efficiency of SLMs in customer service QA experiments.
- Lakshan2003/slm-cost-benchmark-testset-1000
  Viewer • Updated • 1k • 7
- Lakshan2003/slm-cost-benchmark-final-summary
  Viewer • Updated • 9 • 10
- Lakshan2003/Qwen3-1.7B-Instruct-cost-benchmark-results
  Viewer • Updated • 1k • 8
- Lakshan2003/Qwen3-4B-Instruct-cost-benchmark-results
  Viewer • Updated • 1k • 10
Customer Service Human Evaluation Data (Evaluator 2)
Per-model human evaluation datasets (evaluator_2) for customer service client-agent conversations.
- Lakshan2003/Llama-3.2-Instruct-customerservice-Human-evaluator_2_data
  Viewer • Updated • 500 • 5
- Lakshan2003/Phi-4-Mini-customerservice-Human-evaluator_2_data
  Viewer • Updated • 500 • 5
- Lakshan2003/Qwen3-4B-customerservice-Human-evaluator_2_data
  Viewer • Updated • 500 • 6
- Lakshan2003/GPT-4.1-customerservice-Human-evaluator_2_data
  Viewer • Updated • 500 • 5
Pairwise Comparison Datasets (Virtuoso-Large vs SLMs)
Pairwise comparison datasets used to evaluate SLM responses against Virtuoso-Large on customer service client-agent conversations.
- Lakshan2003/pairwise-virtuoso-large-vs-qwen3-1.7b
  Viewer • Updated • 1k • 7
- Lakshan2003/pairwise-virtuoso-large-vs-qwen3-4b
  Viewer • Updated • 1k • 8
- Lakshan2003/pairwise-virtuoso-large-vs-qwen3-8b
  Viewer • Updated • 1k • 6
- Lakshan2003/pairwise-virtuoso-large-vs-phi-4-mini
  Viewer • Updated • 1k • 10
Pairwise Comparison (GPT-4.1 vs SLMs)
Pairwise comparison datasets used to evaluate SLM responses against GPT-4.1.
Customer Service SLM/LLM Inference Outputs
Per-model inference outputs from SLMs and LLMs evaluated on customer service client-agent conversations.
- Lakshan2003/Qwen3-1.7B-customerservice-evaldata
  Viewer • Updated • 36.7k • 13
- Lakshan2003/Qwen3-4B-customerservice-evaldata
  Viewer • Updated • 36.7k • 4 • 1
- Lakshan2003/Qwen3-8B-customerservice-evaldata
  Viewer • Updated • 36.7k • 7
- Lakshan2003/Llama3.1-8b-instruct-customerservice-evaldata
  Viewer • Updated • 36.7k • 13
Context Summarization Model Inference Outputs
- Lakshan2003/Llama3.2-instruct-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 24
- Lakshan2003/Llama3.1-8b-instruct-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14
- Lakshan2003/Qwen3-8B-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14
- Lakshan2003/Qwen3-4B-customerservice-context-summarization-evaldata
  Viewer • Updated • 10k • 14