hikewa commited on
Commit
be2e047
·
verified ·
1 Parent(s): cd4a477

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +20 -32
README.md CHANGED
@@ -12,49 +12,37 @@ pinned: false
12
 
13
  # Dialectic Reasoning
14
 
15
- Interactive demo for the **dialectic LoRA model family**, with the **Qwen3-8B variant** as the primary model.
16
 
17
- This Space is meant to demonstrate a specific capability:
18
 
19
- - better **crux identification**
20
- - stronger **conditional commitment**
21
- - deeper **integrative resolution**
22
 
23
- It is **not** just a “balanced conversation” bot and it is **not** intended as evidence by itself. The supporting evaluation artifacts live in the associated dataset/model repos.
 
 
24
 
25
- ## What This Demo Represents
26
 
27
- The strongest current result in the family is the **8B LoRA**:
28
 
29
- - base model: `Qwen/Qwen3-8B`
30
- - trained on **408 examples** drawn from a larger **510-trace internal corpus**
31
- - evaluated on held-out prompts with a rubric focused on real synthesis behavior
32
 
33
- Smaller family members also exist, but they should be treated as exploratory variants rather than equivalent peers.
34
 
35
- ## Main Result
36
 
37
- On a held-out rubric evaluation, the fine-tuned 8B model improved substantially over base Qwen3-8B on:
38
 
39
- - **Conditional commitment**
40
- - **Actionability**
41
- - **Resolution depth**
42
- - **Crux clarity**
43
-
44
- It also reduced weak and bad outputs, although generic hedge language is still too common.
45
-
46
- ## Read This As A Demo, Not The Whole Claim
47
-
48
- Use the Space to get a feel for the behavior.
49
-
50
- For the actual methodology and published reports, see:
51
-
52
- - model: `hikewa/dialectic-qwen3-8b-lora`
53
- - dataset + eval artifacts: `hikewa/dialectic-reasoning-traces`
54
 
55
  ## Limitations
56
 
57
  - The Space is a demo wrapper, not a research paper
58
- - Public dataset release is smaller than the full internal corpus used for the 8B model
59
- - The model can still sound diplomatic or over-general on some prompts
60
- - Stronger evidence comes from held-out evaluation, not from an isolated chat impression
 
12
 
13
  # Dialectic Reasoning
14
 
15
+ Interactive demo for the **dialectic LoRA model family**, fine-tuned to identify genuine tensions, make conditional commitments, and reach integrative resolutions instead of hedging.
16
 
17
+ ## Current Best: 4B v3
18
 
19
+ The strongest model in the family is the **[Qwen3-4B v3 LoRA](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)**:
 
 
20
 
21
+ - Trained on **507 examples** (408 original + 99 domain-diverse traces from 3 model families)
22
+ - Rubric avg: **9.8/10** — all 14 held-out prompts score "strong"
23
+ - generic_hedge: **0.00** (eliminated)
24
 
25
+ The earlier 8B model (6.6/10 on 408 traces) demonstrated that data diversity matters more than model size.
26
 
27
+ ## What This Demo Shows
28
 
29
+ - **Crux identification** — finding the real decision point
30
+ - **Conditional commitment** "if X, then Y; if Z, then W"
31
+ - **Integrative resolution** not "both sides have merit" but concrete synthesis
32
 
33
+ This is **not** a balanced conversation bot. It is a demo of a specific trained capability.
34
 
35
+ ## Evidence
36
 
37
+ For methodology and evaluation:
38
 
39
+ - Best model: [hikewa/dialectic-qwen3-4b-v3-lora](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)
40
+ - 8B model: [hikewa/dialectic-qwen3-8b-lora](https://huggingface.co/hikewa/dialectic-qwen3-8b-lora)
41
+ - Dataset + eval artifacts: [hikewa/dialectic-reasoning-traces](https://huggingface.co/datasets/hikewa/dialectic-reasoning-traces)
 
 
 
 
 
 
 
 
 
 
 
 
42
 
43
  ## Limitations
44
 
45
  - The Space is a demo wrapper, not a research paper
46
+ - Training data is synthetic (multi-model generated)
47
+ - English-only
48
+ - Stronger evidence comes from held-out evaluation, not from chat impressions