Spaces:

hikewa
/

dialectic-reasoning

Sleeping

App Files Files Community

hikewa commited on Apr 3

Commit

be2e047

verified ·

1 Parent(s): cd4a477

Upload README.md with huggingface_hub

Browse files

Files changed (1) hide show

README.md +20 -32

README.md CHANGED Viewed

@@ -12,49 +12,37 @@ pinned: false
 # Dialectic Reasoning
-Interactive demo for the **dialectic LoRA model family**, with the **Qwen3-8B variant** as the primary model.
-This Space is meant to demonstrate a specific capability:
-- better **crux identification**
-- stronger **conditional commitment**
-- deeper **integrative resolution**
-It is **not** just a “balanced conversation” bot and it is **not** intended as evidence by itself. The supporting evaluation artifacts live in the associated dataset/model repos.
-## What This Demo Represents
-The strongest current result in the family is the **8B LoRA**:
-- base model: `Qwen/Qwen3-8B`
-- trained on **408 examples** drawn from a larger **510-trace internal corpus**
-- evaluated on held-out prompts with a rubric focused on real synthesis behavior
-Smaller family members also exist, but they should be treated as exploratory variants rather than equivalent peers.
-## Main Result
-On a held-out rubric evaluation, the fine-tuned 8B model improved substantially over base Qwen3-8B on:
-- **Conditional commitment**
-- **Actionability**
-- **Resolution depth**
-- **Crux clarity**
-It also reduced weak and bad outputs, although generic hedge language is still too common.
-## Read This As A Demo, Not The Whole Claim
-Use the Space to get a feel for the behavior.
-For the actual methodology and published reports, see:
-- model: `hikewa/dialectic-qwen3-8b-lora`
-- dataset + eval artifacts: `hikewa/dialectic-reasoning-traces`
 ## Limitations
 - The Space is a demo wrapper, not a research paper
-- Public dataset release is smaller than the full internal corpus used for the 8B model
-- The model can still sound diplomatic or over-general on some prompts
-- Stronger evidence comes from held-out evaluation, not from an isolated chat impression

 # Dialectic Reasoning
+Interactive demo for the **dialectic LoRA model family**, fine-tuned to identify genuine tensions, make conditional commitments, and reach integrative resolutions instead of hedging.
+## Current Best: 4B v3
+The strongest model in the family is the **[Qwen3-4B v3 LoRA](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)**:
+- Trained on **507 examples** (408 original + 99 domain-diverse traces from 3 model families)
+- Rubric avg: **9.8/10** — all 14 held-out prompts score "strong"
+- generic_hedge: **0.00** (eliminated)
+The earlier 8B model (6.6/10 on 408 traces) demonstrated that data diversity matters more than model size.
+## What This Demo Shows
+- **Crux identification** — finding the real decision point
+- **Conditional commitment** — "if X, then Y; if Z, then W"
+- **Integrative resolution** — not "both sides have merit" but concrete synthesis
+This is **not** a balanced conversation bot. It is a demo of a specific trained capability.
+## Evidence
+For methodology and evaluation:
+- Best model: [hikewa/dialectic-qwen3-4b-v3-lora](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)
+- 8B model: [hikewa/dialectic-qwen3-8b-lora](https://huggingface.co/hikewa/dialectic-qwen3-8b-lora)
+- Dataset + eval artifacts: [hikewa/dialectic-reasoning-traces](https://huggingface.co/datasets/hikewa/dialectic-reasoning-traces)
 ## Limitations
 - The Space is a demo wrapper, not a research paper
+- Training data is synthetic (multi-model generated)
+- English-only
+- Stronger evidence comes from held-out evaluation, not from chat impressions