Update README.md

README.md (CHANGED)

@@ -2,57 +2,30 @@
license: mit
---
## gte-large.Q6_K.gguf

A Q6_K quantized version of the [General Text Embeddings (GTE) model](https://huggingface.co/thenlper/gte-large), released under the MIT License. Used for all things RAG in the library.
A powerful, object-oriented, and highly configurable general-purpose library used to connect a local back-end running a Large Language Model (LLM) to a front-end. This library offers many tools and features for those who want to code their own C# front-end or LLM-powered tools without having to do all the heavy lifting. It can happily connect to the most popular back-ends (the program loading the LLM proper) and lets your code "speak" with the LLM in a few function calls. It includes easy-to-use (and easy to build upon) systems for most of the operations you'd want to perform with an LLM, alongside many advanced features such as RAG, agentic systems, web search, text-to-speech, semantic similarity testing, and prompt manipulation.
##

- **Kobold API:** Used by [KoboldCpp](https://github.com/LostRuins/koboldcpp). This is the recommended back-end, with the most features.
- **OpenAI API:** Used by [LM Studio](https://lmstudio.ai/), [Text Generation WebUI](https://github.com/oobabooga/text-generation-webui), and others. Fewer features.
## ⭐ Main Features

- Easy-to-use classes for bot personas, system prompts, instruction formats, and inference settings
- Session-based chatlog with automated summaries for past sessions
- Streamed (or not) inference, reroll, and impersonate functions
- Supports CoT / "thinking" models out of the box
- GBNF grammar generation directly from a class's structure, for structured output
- Basic support for VLMs (visual language models), depending on the back-end
- Tools for reliable web search (DuckDuckGo and Brave)
- Text-to-speech support (through the *Kobold API* only)
- Many useful tools to manipulate text, count tokens, and more
- Keyword-triggered text insertions (also known as "world info" in many front-ends)
- Customizable RAG system using the Small World implementation
- Automatic (optional) and configurable insertion of relevant past chat sessions into the context
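As context for the structured-output feature listed above: GBNF is the grammar format used by llama.cpp-family back-ends to constrain generation. The grammar below is a hand-written illustration (not the library's generated output, and the field name `label` is made up) of what a grammar forcing a tiny JSON object might look like:

```gbnf
root   ::= "{" ws "\"label\":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9 ]* "\""
ws     ::= [ \t\n]*
```

A back-end given such a grammar can only emit token sequences that match it, which is what makes mapping a class's structure to a grammar yield reliably parseable output.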
## 🧠 Agentic and Brain Module for Personas

- Background agent system (the bot can run tasks in the background)
- Analyzes past chat sessions, runs relevant web searches, and mentions the results in the next session
- Mood tracking + drift system (personality coloring over time)
- Goal-driven behaviors (long-term projects, self-seeding topics of interest)
##

- Group chat functionality (one user and multiple AI characters)
- Sentiment analysis
# Lethe AI Sharp

Misc useful content for the [Lethe AI Sharp](https://github.com/SerialKicked/Lethe-AI-Sharp/) library. It is a modular, object-oriented C# library that connects local or remote Large Language Model (LLM) backends to your applications (desktop tools, game engines, services). It also comes with its own lightweight backend, allowing you to run a local LLM in the GGUF format directly, without relying on anything else.
It unifies chat personas, conversation/session management, streaming inference, long-term memory, RAG (retrieval-augmented generation), background agentic tasks, web search tools, TTS, and structured output generation. It is extensible, documented, and backend-agnostic (you write the same code no matter which backend is being used).
**No Python Dependencies:** Pure .NET 10 C# implementation. No Python runtime, no conda environments, no pip hell.
**Self-Contained:** The built-in LlamaSharp backend means you can distribute a single executable that runs LLMs locally. No external server is required, but external servers are supported too.
## Fixed ChatML Jinja Templates for Qwen 3.5

This repo also contains fixed Jinja templates for Qwen 3.5 models. This version allows system messages mid-conversation (a requirement for LetheAI) while removing an error that would trigger (at least) on LM Studio.
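For intuition (this is a plain-Python stand-in, not the repo's actual template): a ChatML template accepts mid-conversation system messages precisely when every role, `system` included, is rendered with the same uniform `<|im_start|>…<|im_end|>` wrapper instead of being special-cased to the first position. The sketch below mimics what such a Jinja template expands to:

```python
# Minimal sketch of uniform ChatML rendering (illustrative, not the shipped template).
# Every role gets the same wrapper, so a "system" turn is legal anywhere.
def chatml(messages, add_generation_prompt=True):
    out = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"  # open the assistant turn for generation
    return out

prompt = chatml([
    {"role": "user", "content": "Hi"},
    {"role": "system", "content": "Stay concise."},  # mid-conversation system message
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```

A template that instead asserts the system message is first (or merges it into the opening turn) would reject or mangle the second message above.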
Two models are required by the [**Lethe-AI C# Middleware Library**](https://github.com/SerialKicked/Lethe-AI-Sharp):
## gte-large.Q6_K.gguf

A Q6_K quantized version of the [General Text Embeddings (GTE) model](https://huggingface.co/thenlper/gte-large), released under the MIT License. Used for all things RAG in the library.
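As a rough sketch of how an embedding model like this serves RAG: each document and the query are embedded, and retrieval ranks documents by cosine similarity. The toy 3-dimensional vectors and document names below are made up, standing in for the model's real embedding vectors:

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by both vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy stand-ins for real embedding vectors (illustrative values only).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.7, 0.3, 0.2],
    "stocks": [0.0, 0.1, 0.95],
}
query = [0.85, 0.15, 0.05]

# Retrieval step: pick the document whose embedding is closest to the query's.
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # → cats
```

The retrieved text (not its vector) is then inserted into the LLM's context.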
## emotion-bert-classifier.gguf

A quantized version of [Emotions Analyzer](https://huggingface.co/logasanjeev/emotions-analyzer-bert), released under the MIT License: a BERT-base-uncased model fine-tuned on GoEmotions for multi-label classification (28 emotions). Used for sentiment analysis tasks.
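Multi-label classification means each of the 28 emotions is scored independently (sigmoid per label) rather than competing in one softmax, so a sentence can carry several emotions at once. The sketch below uses made-up logits for four labels; a real run would produce one logit per label from the classifier:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical logits for 4 of the 28 GoEmotions labels (illustrative values).
logits = {"joy": 2.1, "gratitude": 0.8, "anger": -3.0, "neutral": -0.5}

# Multi-label: threshold each label's sigmoid independently (no softmax),
# so any number of labels can fire for the same input.
predicted = [label for label, z in logits.items() if sigmoid(z) >= 0.5]
print(predicted)  # → ['joy', 'gratitude']
```

The 0.5 threshold here is a common default, not a value taken from the model card.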
---