Image Classification
Transformers
Tibetan
tibetan
script-classification
dinov3
binary
karma689 committed on
Commit 92d18ee · verified · 1 Parent(s): e50372c

Update Gyuyig vs Tsugdri binary classifier: center_crop 224 metrics, confusion matrix, training history

Files changed (8)
  1. README.md +29 -29
  2. config.yaml +6 -5
  3. confusion_matrix.json +19 -19
  4. confusion_matrix.png +0 -0
  5. final_model.pt +1 -1
  6. model_card.json +21 -21
  7. results.json +174 -240
  8. training_history.png +0 -0
README.md CHANGED
@@ -25,11 +25,11 @@ Fine-tuned [DINOv3 ViT-S](https://huggingface.co/facebook/dinov3-vits16-pretrain
25
 
26
  **Gyuyig**, **Tsugdri**
27
 
28
- **Experiment:** `dinov3_gyuyig_tsugdri_sub_warmstart` (`gyuyig_tsugdri_binary_classification`)
29
  **Pooling:** ViT **CLS token** (`last_hidden_state[:, 0, :]`)
30
  **Weights:** `final_model.pt` (best validation macro-F1 across stages A/B/C)
31
 
32
- **Warm-start:** [BDRC/4-class-balanced-script-classifier](https://huggingface.co/BDRC/4-class-balanced-script-classifier) (`final_model.pt` — prior test acc 82.6%, macro-F1 0.833)
33
 
34
  ## Data
35
 
@@ -43,56 +43,56 @@ Test split: balanced benchmark (60 images per parent class, held out of training
43
 
44
  | Split | Mode | Size |
45
  |-------|------|-----:|
46
- | train | `resize_letterbox` | 448 |
47
- | val | `resize_letterbox` | 448 |
48
- | test | `resize_letterbox` | 448 |
49
 
50
  ## Validation metrics (n=60)
51
 
52
  | Metric | Value |
53
  |--------|------:|
54
- | Accuracy | 85.0% |
55
- | Macro F1 | 0.850 |
56
- | Weighted F1 | 0.850 |
57
- | AUC-ROC | 0.902 |
58
- | Loss | 0.4520 |
59
 
60
- **Best checkpoint:** `best_stage_c_last_blocks.pt` epoch 7 val macro-F1 0.850
61
 
62
  ### Per-class (validation)
63
 
64
  ```
65
  precision recall f1-score support
66
 
67
- Gyuyig 0.89 0.80 0.84 30
68
- Tsugdri 0.82 0.90 0.86 30
69
 
70
- accuracy 0.85 60
71
- macro avg 0.85 0.85 0.85 60
72
- weighted avg 0.85 0.85 0.85 60
73
  ```
74
 
75
  ## Test / benchmark metrics (n=120)
76
 
77
  | Metric | Value |
78
  |--------|------:|
79
- | Accuracy | 80.8% |
80
- | Macro F1 | 0.808 |
81
- | Weighted F1 | 0.808 |
82
- | AUC-ROC | 0.868 |
83
- | Loss | 0.5354 |
84
 
85
  ### Per-class (test)
86
 
87
  ```
88
  precision recall f1-score support
89
 
90
- Gyuyig 0.79 0.83 0.81 60
91
- Tsugdri 0.82 0.78 0.80 60
92
 
93
- accuracy 0.81 120
94
- macro avg 0.81 0.81 0.81 120
95
- weighted avg 0.81 0.81 0.81 120
96
  ```
97
 
98
  ## Training
@@ -116,8 +116,8 @@ weighted avg 0.81 0.81 0.81 120
116
 
117
  | True \ Pred | Gyuyig | Tsugdri |
118
  |---|---:|---:|
119
- | **Gyuyig** | 50 | 10 |
120
- | **Tsugdri** | 13 | 47 |
121
 
122
  ## Files
123
 
@@ -137,7 +137,7 @@ weighted avg 0.81 0.81 0.81 120
137
 
138
  ```bash
139
  pip install -r requirements-inference.txt
140
- python inference.py --checkpoint final_model.pt --image path/to/page.jpg --preprocess resize_letterbox --preprocess-size 448
141
  ```
142
 
143
  ## Reproduce training
 
25
 
26
  **Gyuyig**, **Tsugdri**
27
 
28
+ **Experiment:** `dinov3_gyuyig_tsugdri_binary` (`gyuyig_tsugdri_binary_classification`)
29
  **Pooling:** ViT **CLS token** (`last_hidden_state[:, 0, :]`)
30
  **Weights:** `final_model.pt` (best validation macro-F1 across stages A/B/C)
31
 
32
+ **Warm-start:** [BDRC/4-class-balanced-script-classifier](https://huggingface.co/BDRC/4-class-balanced-script-classifier) (`final_model.pt` — prior test acc 92.1%, macro-F1 0.921)
33
 
34
  ## Data
35
 
 
43
 
44
  | Split | Mode | Size |
45
  |-------|------|-----:|
46
+ | train | `center_crop` | 224 |
47
+ | val | `center_crop` | 224 |
48
+ | test | `center_crop` | 224 |
49
 
50
  ## Validation metrics (n=60)
51
 
52
  | Metric | Value |
53
  |--------|------:|
54
+ | Accuracy | 91.7% |
55
+ | Macro F1 | 0.916 |
56
+ | Weighted F1 | 0.916 |
57
+ | AUC-ROC | 0.931 |
58
+ | Loss | 0.3915 |
59
 
60
+ **Best checkpoint:** `best_stage_c_last_blocks.pt` epoch 1 val macro-F1 0.916
61
 
62
  ### Per-class (validation)
63
 
64
  ```
65
  precision recall f1-score support
66
 
67
+ Gyuyig 0.88 0.97 0.92 30
68
+ Tsugdri 0.96 0.87 0.91 30
69
 
70
+ accuracy 0.92 60
71
+ macro avg 0.92 0.92 0.92 60
72
+ weighted avg 0.92 0.92 0.92 60
73
  ```
74
 
75
  ## Test / benchmark metrics (n=120)
76
 
77
  | Metric | Value |
78
  |--------|------:|
79
+ | Accuracy | 85.0% |
80
+ | Macro F1 | 0.848 |
81
+ | Weighted F1 | 0.848 |
82
+ | AUC-ROC | 0.930 |
83
+ | Loss | 0.4047 |
84
 
85
  ### Per-class (test)
86
 
87
  ```
88
  precision recall f1-score support
89
 
90
+ Gyuyig 0.78 0.97 0.87 60
91
+ Tsugdri 0.96 0.73 0.83 60
92
 
93
+ accuracy 0.85 120
94
+ macro avg 0.87 0.85 0.85 120
95
+ weighted avg 0.87 0.85 0.85 120
96
  ```
97
 
98
  ## Training
 
116
 
117
  | True \ Pred | Gyuyig | Tsugdri |
118
  |---|---:|---:|
119
+ | **Gyuyig** | 58 | 2 |
120
+ | **Tsugdri** | 16 | 44 |
121
 
122
  ## Files
123
 
 
137
 
138
  ```bash
139
  pip install -r requirements-inference.txt
140
+ python inference.py --checkpoint final_model.pt --image path/to/page.jpg --preprocess resize_letterbox --preprocess-size 224
141
  ```
142
 
143
  ## Reproduce training
config.yaml CHANGED
@@ -1,9 +1,10 @@
1
- experiment: dinov3_gyuyig_tsugdri_sub_warmstart
2
  task: gyuyig_tsugdri_binary_classification
3
 
4
  balanced_dataset_repo: BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset
5
  val_ratio: 0.15
6
 
 
7
  warmstart_repo: BDRC/4-class-balanced-script-classifier
8
  warmstart_checkpoint_file: final_model.pt
9
 
@@ -17,10 +18,10 @@ no_weighted_sampler: true
17
  skip_stage_c: false
18
  gradient_checkpointing: true
19
 
20
- train_preprocess: resize_letterbox
21
- val_preprocess: resize_letterbox
22
- test_preprocess: resize_letterbox
23
- preprocess_size: 448
24
 
25
  pooling: cls_token
26
 
 
1
+ experiment: dinov3_gyuyig_tsugdri_binary
2
  task: gyuyig_tsugdri_binary_classification
3
 
4
  balanced_dataset_repo: BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset
5
  val_ratio: 0.15
6
 
7
+
8
  warmstart_repo: BDRC/4-class-balanced-script-classifier
9
  warmstart_checkpoint_file: final_model.pt
10
 
 
18
  skip_stage_c: false
19
  gradient_checkpointing: true
20
 
21
+ train_preprocess: center_crop
22
+ val_preprocess: center_crop
23
+ test_preprocess: center_crop
24
+ preprocess_size: 224
25
 
26
  pooling: cls_token
27
 
confusion_matrix.json CHANGED
@@ -6,36 +6,36 @@
6
  ],
7
  "matrix": [
8
  [
9
- 50,
10
- 10
11
  ],
12
  [
13
- 13,
14
- 47
15
  ]
16
  ],
17
  "test_metrics": {
18
- "loss": 0.5353951374689738,
19
- "accuracy": 0.8083333333333333,
20
- "macro_f1": 0.8082134667500521,
21
- "weighted_f1": 0.8082134667500521,
22
- "auc_roc": 0.8677777777777778
23
  },
24
  "val_metrics": {
25
- "loss": 0.45204214652379354,
26
- "accuracy": 0.85,
27
- "macro_f1": 0.849624060150376,
28
- "weighted_f1": 0.849624060150376,
29
- "auc_roc": 0.9022222222222223
30
  },
31
  "preprocess": {
32
- "train": "resize_letterbox",
33
- "val": "resize_letterbox",
34
- "test": "resize_letterbox",
35
- "size": 448
36
  },
37
  "train_dataset": "BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset",
38
  "benchmark_per_parent": 60,
39
- "experiment": "dinov3_gyuyig_tsugdri_sub_warmstart",
40
  "repo_id": "BDRC/gyuyig-tsugdri-binary-script-classifier"
41
  }
 
6
  ],
7
  "matrix": [
8
  [
9
+ 58,
10
+ 2
11
  ],
12
  [
13
+ 16,
14
+ 44
15
  ]
16
  ],
17
  "test_metrics": {
18
+ "loss": 0.40474860469500223,
19
+ "accuracy": 0.85,
20
+ "macro_f1": 0.847930160518164,
21
+ "weighted_f1": 0.847930160518164,
22
+ "auc_roc": 0.9297222222222223
23
  },
24
  "val_metrics": {
25
+ "loss": 0.3914561231931051,
26
+ "accuracy": 0.9166666666666666,
27
+ "macro_f1": 0.9164578111946533,
28
+ "weighted_f1": 0.9164578111946533,
29
+ "auc_roc": 0.9311111111111111
30
  },
31
  "preprocess": {
32
+ "train": "center_crop",
33
+ "val": "center_crop",
34
+ "test": "center_crop",
35
+ "size": 224
36
  },
37
  "train_dataset": "BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset",
38
  "benchmark_per_parent": 60,
39
+ "experiment": "dinov3_gyuyig_tsugdri_binary",
40
  "repo_id": "BDRC/gyuyig-tsugdri-binary-script-classifier"
41
  }
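
The headline test metrics can be re-derived from the 2x2 test matrix alone. The following is a minimal sketch, assuming rows are true labels and columns are predictions (as in the README table); the function name `per_class_metrics` is hypothetical and not part of this repository.

```python
def per_class_metrics(matrix, labels):
    """Derive accuracy, per-class precision/recall/F1 and macro-F1.

    Assumes matrix[i][j] counts samples with true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    per_class = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    macro_f1 = sum(m["f1"] for m in per_class.values()) / n
    return {"accuracy": correct / total, "macro_f1": macro_f1, "per_class": per_class}


# Test matrix from this commit: Gyuyig row first, Tsugdri row second.
result = per_class_metrics([[58, 2], [16, 44]], ["Gyuyig", "Tsugdri"])
print(round(result["accuracy"], 4), round(result["macro_f1"], 4))
for name, m in result["per_class"].items():
    print(name, {k: round(v, 2) for k, v in m.items()})
```

Read this way, the asymmetry in the matrix is visible directly: 16 of 60 Tsugdri pages are predicted as Gyuyig, while only 2 of 60 Gyuyig pages go the other way, which is why Tsugdri recall (0.73) trails Gyuyig recall (0.97).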
confusion_matrix.png CHANGED
final_model.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6753c4b49f6fa52b6d4580b926070cdea8dba7908f0e3a562f60dd42512e3148
3
  size 86670182
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9cce9a6920cff9cd5fc7add0a40b81969f36383fc1919fcb4c0adb1b0f2047b1
3
  size 86670182
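
In a Git LFS pointer, the `oid sha256:` value is the SHA-256 digest of the real file and `size` is its length in bytes, so a downloaded `final_model.pt` can be checked against the pointer. A minimal sketch follows; the helper names `sha256_of_file` and `matches_lfs_pointer` are hypothetical, and the local path in the usage comment is an assumption.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_pointer(path, expected_oid, expected_size):
    """Return True if both the byte size and the SHA-256 digest match the pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_oid


# Usage (assumes final_model.pt was downloaded to the working directory):
# matches_lfs_pointer(
#     "final_model.pt",
#     "9cce9a6920cff9cd5fc7add0a40b81969f36383fc1919fcb4c0adb1b0f2047b1",
#     86670182,
# )
```

Because the size is unchanged between the two pointers in this commit, only the digest distinguishes the new weights from the old ones.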
model_card.json CHANGED
@@ -3,27 +3,27 @@
3
  "train_dataset_id": "BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset",
4
  "warmstart_repo": "BDRC/4-class-balanced-script-classifier",
5
  "task": "gyuyig_tsugdri_binary_classification",
6
- "experiment": "dinov3_gyuyig_tsugdri_sub_warmstart",
7
  "classes": [
8
  "Gyuyig",
9
  "Tsugdri"
10
  ],
11
  "pooling": "cls_token",
12
  "preprocess": {
13
- "train": "resize_letterbox",
14
- "val": "resize_letterbox",
15
- "test": "resize_letterbox",
16
- "size": 448
17
  },
18
  "warmstart": {
19
  "warmstart_repo": "BDRC/4-class-balanced-script-classifier",
20
  "warmstart_checkpoint": null,
21
  "warmstart_checkpoint_file": "final_model.pt",
22
  "checkpoint_test_metrics": {
23
- "loss": 0.6574946736847913,
24
- "accuracy": 0.825925925925926,
25
- "macro_f1": 0.8326187473728457,
26
- "weighted_f1": 0.82908384875598
27
  },
28
  "warmstart_pooling": "cls_token"
29
  },
@@ -66,21 +66,21 @@
66
  },
67
  "best_checkpoint": {
68
  "path": "best_stage_c_last_blocks.pt",
69
- "epoch": 7,
70
- "val_macro_f1": 0.849624060150376
71
  },
72
  "val_metrics": {
73
- "loss": 0.45204214652379354,
74
- "accuracy": 0.85,
75
- "macro_f1": 0.849624060150376,
76
- "weighted_f1": 0.849624060150376,
77
- "auc_roc": 0.9022222222222223
78
  },
79
  "test_metrics": {
80
- "loss": 0.5353951374689738,
81
- "accuracy": 0.8083333333333333,
82
- "macro_f1": 0.8082134667500521,
83
- "weighted_f1": 0.8082134667500521,
84
- "auc_roc": 0.8677777777777778
85
  }
86
  }
 
3
  "train_dataset_id": "BDRC/gyuyig-tsugdri-binary-balanced-script-classification-dataset",
4
  "warmstart_repo": "BDRC/4-class-balanced-script-classifier",
5
  "task": "gyuyig_tsugdri_binary_classification",
6
+ "experiment": "dinov3_gyuyig_tsugdri_binary",
7
  "classes": [
8
  "Gyuyig",
9
  "Tsugdri"
10
  ],
11
  "pooling": "cls_token",
12
  "preprocess": {
13
+ "train": "center_crop",
14
+ "val": "center_crop",
15
+ "test": "center_crop",
16
+ "size": 224
17
  },
18
  "warmstart": {
19
  "warmstart_repo": "BDRC/4-class-balanced-script-classifier",
20
  "warmstart_checkpoint": null,
21
  "warmstart_checkpoint_file": "final_model.pt",
22
  "checkpoint_test_metrics": {
23
+ "loss": 0.42663901050885517,
24
+ "accuracy": 0.9208333333333333,
25
+ "macro_f1": 0.9208489161207983,
26
+ "weighted_f1": 0.9208489161207983
27
  },
28
  "warmstart_pooling": "cls_token"
29
  },
 
66
  },
67
  "best_checkpoint": {
68
  "path": "best_stage_c_last_blocks.pt",
69
+ "epoch": 1,
70
+ "val_macro_f1": 0.9164578111946533
71
  },
72
  "val_metrics": {
73
+ "loss": 0.3914561231931051,
74
+ "accuracy": 0.9166666666666666,
75
+ "macro_f1": 0.9164578111946533,
76
+ "weighted_f1": 0.9164578111946533,
77
+ "auc_roc": 0.9311111111111111
78
  },
79
  "test_metrics": {
80
+ "loss": 0.40474860469500223,
81
+ "accuracy": 0.85,
82
+ "macro_f1": 0.847930160518164,
83
+ "weighted_f1": 0.847930160518164,
84
+ "auc_roc": 0.9297222222222223
85
  }
86
  }
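
The `preprocess` block above switches every split to `center_crop` at size 224. The repository's own implementation is not shown in this diff, so the following is only a sketch of the general idea, assuming the crop is taken around the image centre with no prior resize and is clamped when the page is smaller than the target. The function name `center_crop_box` is hypothetical; the returned tuple follows the (left, upper, right, lower) convention used by Pillow's `Image.crop`.

```python
def center_crop_box(width, height, size=224):
    """Compute a centred crop box of at most size x size pixels.

    Assumption: no resize happens before cropping; the real pipeline may differ.
    """
    crop_w = min(size, width)    # clamp if the image is narrower than the target
    crop_h = min(size, height)   # clamp if the image is shorter than the target
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


# A wide manuscript page, 1000 x 400 pixels:
print(center_crop_box(1000, 400))
```

Unlike the earlier `resize_letterbox` at 448, a plain centre crop of this kind discards everything outside the central window, which matters for wide pecha-style pages where most of the text lies outside a 224-pixel square.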
results.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "experiment": "dinov3_gyuyig_tsugdri_sub_warmstart",
3
  "run_subdir": null,
4
  "task": "gyuyig_tsugdri_binary_classification",
5
  "balanced_parquet_dir": null,
@@ -8,10 +8,10 @@
8
  "benchmark_dataset_repo": null,
9
  "benchmark_per_parent": 60,
10
  "preprocess": {
11
- "train": "resize_letterbox",
12
- "val": "resize_letterbox",
13
- "test": "resize_letterbox",
14
- "size": 448
15
  },
16
  "pooling": "cls_token",
17
  "training_config": {
@@ -60,10 +60,10 @@
60
  "warmstart_checkpoint": null,
61
  "warmstart_checkpoint_file": "final_model.pt",
62
  "checkpoint_test_metrics": {
63
- "loss": 0.6574946736847913,
64
- "accuracy": 0.825925925925926,
65
- "macro_f1": 0.8326187473728457,
66
- "weighted_f1": 0.82908384875598
67
  },
68
  "warmstart_pooling": "cls_token"
69
  },
@@ -72,376 +72,310 @@
72
  },
73
  "best_checkpoint": {
74
  "path": "best_stage_c_last_blocks.pt",
75
- "epoch": 7,
76
- "val_macro_f1": 0.849624060150376
77
  },
78
  "val_metrics": {
79
- "loss": 0.45204214652379354,
80
- "accuracy": 0.85,
81
- "macro_f1": 0.849624060150376,
82
- "weighted_f1": 0.849624060150376,
83
- "auc_roc": 0.9022222222222223
84
  },
85
  "val_confusion_matrix": [
86
  [
87
- 24,
88
- 6
89
  ],
90
  [
91
- 3,
92
- 27
93
  ]
94
  ],
95
- "val_report": " precision recall f1-score support\n\n Gyuyig 0.89 0.80 0.84 30\n Tsugdri 0.82 0.90 0.86 30\n\n accuracy 0.85 60\n macro avg 0.85 0.85 0.85 60\nweighted avg 0.85 0.85 0.85 60\n",
96
  "test_metrics": {
97
- "loss": 0.5353951374689738,
98
- "accuracy": 0.8083333333333333,
99
- "macro_f1": 0.8082134667500521,
100
- "weighted_f1": 0.8082134667500521,
101
- "auc_roc": 0.8677777777777778
102
  },
103
  "test_confusion_matrix": [
104
  [
105
- 50,
106
- 10
107
  ],
108
  [
109
- 13,
110
- 47
111
  ]
112
  ],
113
- "test_report": " precision recall f1-score support\n\n Gyuyig 0.79 0.83 0.81 60\n Tsugdri 0.82 0.78 0.80 60\n\n accuracy 0.81 120\n macro avg 0.81 0.81 0.81 120\nweighted avg 0.81 0.81 0.81 120\n",
114
  "history": {
115
  "stage_a": [
116
  {
117
  "epoch": 1,
118
- "train_loss": 0.868421577271961,
119
- "train_acc": 0.39880952380952384,
120
  "lr_head": 5e-06,
121
- "val_loss": 0.8907806555430094,
122
- "val_accuracy": 0.3,
123
- "val_macro_f1": 0.28,
124
- "val_weighted_f1": 0.28
125
  },
126
  {
127
  "epoch": 2,
128
- "train_loss": 0.6925251398767743,
129
- "train_acc": 0.5744047619047619,
130
  "lr_head": 0.0005,
131
- "val_loss": 0.6001647631327311,
132
- "val_accuracy": 0.6166666666666667,
133
- "val_macro_f1": 0.5911111111111111,
134
- "val_weighted_f1": 0.5911111111111111
135
  },
136
  {
137
  "epoch": 3,
138
- "train_loss": 0.5732808538845607,
139
- "train_acc": 0.6964285714285714,
140
  "lr_head": 0.0004668412874366486,
141
- "val_loss": 0.5495736837387085,
142
- "val_accuracy": 0.7166666666666667,
143
- "val_macro_f1": 0.7165879410947485,
144
- "val_weighted_f1": 0.7165879410947485
145
  },
146
  {
147
  "epoch": 4,
148
- "train_loss": 0.5409448828016009,
149
- "train_acc": 0.7351190476190477,
150
  "lr_head": 0.00037624999999999996,
151
- "val_loss": 0.5596829652786255,
152
- "val_accuracy": 0.7166666666666667,
153
- "val_macro_f1": 0.7127569698676429,
154
- "val_weighted_f1": 0.7127569698676429
155
  },
156
  {
157
  "epoch": 5,
158
- "train_loss": 0.517706477925891,
159
- "train_acc": 0.7380952380952381,
160
  "lr_head": 0.0002525,
161
- "val_loss": 0.570094374815623,
162
- "val_accuracy": 0.6833333333333333,
163
- "val_macro_f1": 0.6723196320781835,
164
- "val_weighted_f1": 0.6723196320781833
165
  },
166
  {
167
  "epoch": 6,
168
- "train_loss": 0.5244638267017546,
169
- "train_acc": 0.7529761904761905,
170
  "lr_head": 0.00012875000000000007,
171
- "val_loss": 0.5446342428525289,
172
- "val_accuracy": 0.7166666666666667,
173
- "val_macro_f1": 0.7159565580618212,
174
- "val_weighted_f1": 0.7159565580618211
175
  },
176
  {
177
  "epoch": 7,
178
- "train_loss": 0.49209926100004286,
179
- "train_acc": 0.7797619047619048,
180
  "lr_head": 3.815871256335142e-05,
181
- "val_loss": 0.5444834152857463,
182
- "val_accuracy": 0.7166666666666667,
183
- "val_macro_f1": 0.7159565580618212,
184
- "val_weighted_f1": 0.7159565580618211
185
  }
186
  ],
187
  "stage_b": [
188
  {
189
  "epoch": 1,
190
- "train_loss": 0.5235937989893413,
191
- "train_acc": 0.75,
192
  "lr_head": 1.0000000000000002e-06,
193
  "lr_backbone": 1.0000000000000001e-07,
194
- "val_loss": 0.5491747617721557,
195
- "val_accuracy": 0.7166666666666667,
196
- "val_macro_f1": 0.7165879410947485,
197
- "val_weighted_f1": 0.7165879410947485
198
  },
199
  {
200
  "epoch": 2,
201
- "train_loss": 0.5453660388787588,
202
- "train_acc": 0.7529761904761905,
203
  "lr_head": 0.0001,
204
  "lr_backbone": 1e-05,
205
- "val_loss": 0.5492973128954569,
206
- "val_accuracy": 0.7333333333333333,
207
- "val_macro_f1": 0.7306397306397306,
208
- "val_weighted_f1": 0.7306397306397305
209
  },
210
  {
211
  "epoch": 3,
212
- "train_loss": 0.49581416731788996,
213
- "train_acc": 0.7916666666666666,
214
  "lr_head": 9.701478472890248e-05,
215
  "lr_backbone": 9.701478472890248e-06,
216
- "val_loss": 0.538078244527181,
217
- "val_accuracy": 0.7166666666666667,
218
- "val_macro_f1": 0.7165879410947485,
219
- "val_weighted_f1": 0.7165879410947485
220
  },
221
  {
222
  "epoch": 4,
223
- "train_loss": 0.4958783601011549,
224
- "train_acc": 0.7678571428571429,
225
  "lr_head": 8.84191999343894e-05,
226
  "lr_backbone": 8.841919993438941e-06,
227
- "val_loss": 0.5364969690640767,
228
- "val_accuracy": 0.7333333333333333,
229
- "val_macro_f1": 0.7321428571428572,
230
- "val_weighted_f1": 0.7321428571428572
231
  },
232
  {
233
  "epoch": 5,
234
- "train_loss": 0.4718715477557409,
235
- "train_acc": 0.7946428571428571,
236
  "lr_head": 7.525e-05,
237
  "lr_backbone": 7.525e-06,
238
- "val_loss": 0.5267467339833577,
239
- "val_accuracy": 0.7333333333333333,
240
- "val_macro_f1": 0.7321428571428572,
241
- "val_weighted_f1": 0.7321428571428572
242
  },
243
  {
244
  "epoch": 6,
245
- "train_loss": 0.4493213253361838,
246
- "train_acc": 0.8244047619047619,
247
  "lr_head": 5.909558479451306e-05,
248
  "lr_backbone": 5.909558479451306e-06,
249
- "val_loss": 0.5186658461888631,
250
- "val_accuracy": 0.7166666666666667,
251
- "val_macro_f1": 0.7159565580618212,
252
- "val_weighted_f1": 0.7159565580618211
253
  },
254
  {
255
  "epoch": 7,
256
- "train_loss": 0.43445002465021043,
257
- "train_acc": 0.8184523809523809,
258
  "lr_head": 4.190441520548695e-05,
259
  "lr_backbone": 4.190441520548696e-06,
260
- "val_loss": 0.5154884020487468,
261
- "val_accuracy": 0.7166666666666667,
262
- "val_macro_f1": 0.7159565580618212,
263
- "val_weighted_f1": 0.7159565580618211
264
  },
265
  {
266
  "epoch": 8,
267
- "train_loss": 0.42955496197655085,
268
- "train_acc": 0.8273809523809523,
269
  "lr_head": 2.5750000000000013e-05,
270
  "lr_backbone": 2.575000000000001e-06,
271
- "val_loss": 0.5194342851638794,
272
- "val_accuracy": 0.7666666666666667,
273
- "val_macro_f1": 0.7643097643097643,
274
- "val_weighted_f1": 0.7643097643097643
275
  },
276
  {
277
  "epoch": 9,
278
- "train_loss": 0.4368310116586231,
279
- "train_acc": 0.8154761904761905,
280
  "lr_head": 1.2580800065610596e-05,
281
  "lr_backbone": 1.2580800065610596e-06,
282
- "val_loss": 0.5142494519551595,
283
- "val_accuracy": 0.7333333333333333,
284
- "val_macro_f1": 0.7321428571428572,
285
- "val_weighted_f1": 0.7321428571428572
286
- },
287
- {
288
- "epoch": 10,
289
- "train_loss": 0.4370355563504355,
290
- "train_acc": 0.8125,
291
- "lr_head": 3.985215271097539e-06,
292
- "lr_backbone": 3.985215271097539e-07,
293
- "val_loss": 0.5129753907521566,
294
- "val_accuracy": 0.7333333333333333,
295
- "val_macro_f1": 0.7321428571428572,
296
- "val_weighted_f1": 0.7321428571428572
297
  }
298
  ],
299
  "stage_c": [
300
  {
301
  "epoch": 1,
302
- "train_loss": 0.4291627889587766,
303
- "train_acc": 0.8273809523809523,
304
  "lr_head": 5.000000000000001e-07,
305
  "lr_backbone": 1.5000000000000002e-07,
306
- "val_loss": 0.5183799107869466,
307
- "val_accuracy": 0.7666666666666667,
308
- "val_macro_f1": 0.7643097643097643,
309
- "val_weighted_f1": 0.7643097643097643
310
  },
311
  {
312
  "epoch": 2,
313
- "train_loss": 0.42413380600157236,
314
- "train_acc": 0.8154761904761905,
315
  "lr_head": 5e-05,
316
  "lr_backbone": 1.5e-05,
317
- "val_loss": 0.5204356749852498,
318
- "val_accuracy": 0.7166666666666667,
319
- "val_macro_f1": 0.7101449275362319,
320
- "val_weighted_f1": 0.7101449275362319
321
  },
322
  {
323
  "epoch": 3,
324
- "train_loss": 0.37433484338578726,
325
- "train_acc": 0.875,
326
  "lr_head": 4.899745109695881e-05,
327
  "lr_backbone": 1.4699235329087644e-05,
328
- "val_loss": 0.48167786995569867,
329
- "val_accuracy": 0.7666666666666667,
330
- "val_macro_f1": 0.7643097643097643,
331
- "val_weighted_f1": 0.7643097643097643
332
  },
333
  {
334
  "epoch": 4,
335
- "train_loss": 0.335602682970819,
336
- "train_acc": 0.8839285714285714,
337
  "lr_head": 4.6071024937571735e-05,
338
  "lr_backbone": 1.3821307481271522e-05,
339
- "val_loss": 0.45847638448079425,
340
- "val_accuracy": 0.8166666666666667,
341
- "val_macro_f1": 0.8153846153846154,
342
- "val_weighted_f1": 0.8153846153846154
343
  },
344
  {
345
  "epoch": 5,
346
- "train_loss": 0.2807346156665257,
347
- "train_acc": 0.9255952380952381,
348
  "lr_head": 4.145780316514581e-05,
349
  "lr_backbone": 1.2437340949543742e-05,
350
- "val_loss": 0.4557582457860311,
351
- "val_accuracy": 0.8,
352
- "val_macro_f1": 0.7991071428571428,
353
- "val_weighted_f1": 0.7991071428571428
354
  },
355
  {
356
  "epoch": 6,
357
- "train_loss": 0.2691161029395603,
358
- "train_acc": 0.9375,
359
  "lr_head": 3.5531521571796694e-05,
360
  "lr_backbone": 1.0659456471539008e-05,
361
- "val_loss": 0.44928998152414956,
362
- "val_accuracy": 0.8333333333333334,
363
- "val_macro_f1": 0.8331479421579533,
364
- "val_weighted_f1": 0.8331479421579533
365
  },
366
  {
367
  "epoch": 7,
368
- "train_loss": 0.23485895210788363,
369
- "train_acc": 0.9375,
370
  "lr_head": 2.877229224726381e-05,
371
  "lr_backbone": 8.631687674179142e-06,
372
- "val_loss": 0.45204214652379354,
373
- "val_accuracy": 0.85,
374
- "val_macro_f1": 0.849624060150376,
375
- "val_weighted_f1": 0.849624060150376
376
- },
377
- {
378
- "epoch": 8,
379
- "train_loss": 0.22606890329292842,
380
- "train_acc": 0.9553571428571429,
381
- "lr_head": 2.1727707752736196e-05,
382
- "lr_backbone": 6.518312325820858e-06,
383
- "val_loss": 0.4554001530011495,
384
- "val_accuracy": 0.7833333333333333,
385
- "val_macro_f1": 0.783273131425396,
386
- "val_weighted_f1": 0.783273131425396
387
- },
388
- {
389
- "epoch": 9,
390
- "train_loss": 0.19964813192685446,
391
- "train_acc": 0.9672619047619048,
392
- "lr_head": 1.4968478428203314e-05,
393
- "lr_backbone": 4.490543528460994e-06,
394
- "val_loss": 0.45547234614690146,
395
- "val_accuracy": 0.8333333333333334,
396
- "val_macro_f1": 0.8325892857142857,
397
- "val_weighted_f1": 0.8325892857142857
398
- },
399
- {
400
- "epoch": 10,
401
- "train_loss": 0.21034001807371774,
402
- "train_acc": 0.9583333333333334,
403
- "lr_head": 9.042196834854196e-06,
404
- "lr_backbone": 2.712659050456259e-06,
405
- "val_loss": 0.45674578348795575,
406
- "val_accuracy": 0.8333333333333334,
407
- "val_macro_f1": 0.8325892857142857,
408
- "val_weighted_f1": 0.8325892857142857
409
- },
410
- {
411
- "epoch": 11,
412
- "train_loss": 0.1949910166717711,
413
- "train_acc": 0.9702380952380952,
414
- "lr_head": 4.428975062428262e-06,
415
- "lr_backbone": 1.3286925187284787e-06,
416
- "val_loss": 0.4595174193382263,
417
- "val_accuracy": 0.8166666666666667,
418
- "val_macro_f1": 0.8153846153846154,
419
- "val_weighted_f1": 0.8153846153846154
420
- },
421
- {
422
- "epoch": 12,
423
- "train_loss": 0.1842061841771716,
424
- "train_acc": 0.9761904761904762,
425
- "lr_head": 1.502548903041193e-06,
426
- "lr_backbone": 4.5076467091235787e-07,
427
- "val_loss": 0.4608180999755859,
428
- "val_accuracy": 0.8166666666666667,
429
- "val_macro_f1": 0.8153846153846154,
430
- "val_weighted_f1": 0.8153846153846154
431
  }
432
  ]
433
  },
434
  "confusion_matrix": [
435
  [
436
- 50,
437
- 10
438
  ],
439
  [
440
- 13,
441
- 47
442
  ]
443
  ],
444
- "report": " precision recall f1-score support\n\n Gyuyig 0.79 0.83 0.81 60\n Tsugdri 0.82 0.78 0.80 60\n\n accuracy 0.81 120\n macro avg 0.81 0.81 0.81 120\nweighted avg 0.81 0.81 0.81 120\n",
445
  "idx_to_label": {
446
  "0": "Gyuyig",
447
  "1": "Tsugdri"
 
1
  {
2
+ "experiment": "dinov3_gyuyig_tsugdri_binary",
3
  "run_subdir": null,
4
  "task": "gyuyig_tsugdri_binary_classification",
5
  "balanced_parquet_dir": null,
 
8
  "benchmark_dataset_repo": null,
9
  "benchmark_per_parent": 60,
10
  "preprocess": {
11
+ "train": "center_crop",
12
+ "val": "center_crop",
13
+ "test": "center_crop",
14
+ "size": 224
15
  },
16
  "pooling": "cls_token",
17
  "training_config": {
 
60
  "warmstart_checkpoint": null,
61
  "warmstart_checkpoint_file": "final_model.pt",
62
  "checkpoint_test_metrics": {
63
+ "loss": 0.42663901050885517,
64
+ "accuracy": 0.9208333333333333,
65
+ "macro_f1": 0.9208489161207983,
66
+ "weighted_f1": 0.9208489161207983
67
  },
68
  "warmstart_pooling": "cls_token"
69
  },
 
72
  },
73
  "best_checkpoint": {
74
  "path": "best_stage_c_last_blocks.pt",
75
+ "epoch": 1,
76
+ "val_macro_f1": 0.9164578111946533
77
  },
78
  "val_metrics": {
79
+ "loss": 0.3914561231931051,
80
+ "accuracy": 0.9166666666666666,
81
+ "macro_f1": 0.9164578111946533,
82
+ "weighted_f1": 0.9164578111946533,
83
+ "auc_roc": 0.9311111111111111
84
  },
85
  "val_confusion_matrix": [
86
  [
87
+ 29,
88
+ 1
89
  ],
90
  [
91
+ 4,
92
+ 26
93
  ]
94
  ],
95
+ "val_report": " precision recall f1-score support\n\n Gyuyig 0.88 0.97 0.92 30\n Tsugdri 0.96 0.87 0.91 30\n\n accuracy 0.92 60\n macro avg 0.92 0.92 0.92 60\nweighted avg 0.92 0.92 0.92 60\n",
96
  "test_metrics": {
97
+ "loss": 0.40474860469500223,
98
+ "accuracy": 0.85,
99
+ "macro_f1": 0.847930160518164,
100
+ "weighted_f1": 0.847930160518164,
101
+ "auc_roc": 0.9297222222222223
102
  },
103
  "test_confusion_matrix": [
104
  [
105
+ 58,
106
+ 2
107
  ],
108
  [
109
+ 16,
110
+ 44
111
  ]
112
  ],
113
+ "test_report": " precision recall f1-score support\n\n Gyuyig 0.78 0.97 0.87 60\n Tsugdri 0.96 0.73 0.83 60\n\n accuracy 0.85 120\n macro avg 0.87 0.85 0.85 120\nweighted avg 0.87 0.85 0.85 120\n",
114
  "history": {
115
  "stage_a": [
116
  {
117
  "epoch": 1,
118
+ "train_loss": 0.8516533403169542,
119
+ "train_acc": 0.4017857142857143,
120
  "lr_head": 5e-06,
121
+ "val_loss": 0.8099705815315247,
122
+ "val_accuracy": 0.36666666666666664,
123
+ "val_macro_f1": 0.36666666666666664,
124
+ "val_weighted_f1": 0.36666666666666664
125
  },
126
  {
127
  "epoch": 2,
128
+ "train_loss": 0.6583722290538606,
129
+ "train_acc": 0.6309523809523809,
130
  "lr_head": 0.0005,
131
+ "val_loss": 0.6123360713322957,
132
+ "val_accuracy": 0.65,
133
+ "val_macro_f1": 0.6378269617706237,
134
+ "val_weighted_f1": 0.6378269617706238
135
  },
136
  {
137
  "epoch": 3,
138
+ "train_loss": 0.5344764121941158,
139
+ "train_acc": 0.7291666666666666,
140
  "lr_head": 0.0004668412874366486,
141
+ "val_loss": 0.5156712194283803,
142
+ "val_accuracy": 0.7833333333333333,
143
+ "val_macro_f1": 0.7827903091060986,
144
+ "val_weighted_f1": 0.7827903091060985
145
  },
146
  {
147
  "epoch": 4,
148
+ "train_loss": 0.4900844920249212,
149
+ "train_acc": 0.7529761904761905,
150
  "lr_head": 0.00037624999999999996,
151
+ "val_loss": 0.45860581994056704,
152
+ "val_accuracy": 0.7666666666666667,
153
+ "val_macro_f1": 0.7624434389140271,
154
+ "val_weighted_f1": 0.7624434389140272
155
  },
156
  {
157
  "epoch": 5,
158
+ "train_loss": 0.4637424250443776,
159
+ "train_acc": 0.7857142857142857,
160
  "lr_head": 0.0002525,
161
+ "val_loss": 0.43475709557533265,
162
+ "val_accuracy": 0.7666666666666667,
163
+ "val_macro_f1": 0.7624434389140271,
164
+ "val_weighted_f1": 0.7624434389140272
165
  },
166
  {
167
  "epoch": 6,
168
+ "train_loss": 0.4428629179795583,
169
+ "train_acc": 0.7827380952380952,
170
  "lr_head": 0.00012875000000000007,
171
+ "val_loss": 0.4212049206097921,
172
+ "val_accuracy": 0.85,
173
+ "val_macro_f1": 0.8499583217560434,
174
+ "val_weighted_f1": 0.8499583217560432
175
  },
176
  {
177
  "epoch": 7,
178
+ "train_loss": 0.4315942114307767,
179
+ "train_acc": 0.8095238095238095,
180
  "lr_head": 3.815871256335142e-05,
181
+ "val_loss": 0.4180481950441996,
182
+ "val_accuracy": 0.85,
183
+ "val_macro_f1": 0.8499583217560434,
184
+ "val_weighted_f1": 0.8499583217560432
185
  }
186
  ],
187
  "stage_b": [
188
  {
189
  "epoch": 1,
190
+ "train_loss": 0.42322553339458646,
191
+ "train_acc": 0.8125,
192
  "lr_head": 1.0000000000000002e-06,
193
  "lr_backbone": 1.0000000000000001e-07,
194
+ "val_loss": 0.42059502998987836,
195
+ "val_accuracy": 0.8666666666666667,
196
+ "val_macro_f1": 0.8666666666666667,
197
+ "val_weighted_f1": 0.8666666666666667
198
  },
199
  {
200
  "epoch": 2,
201
+ "train_loss": 0.43337753273191904,
202
+ "train_acc": 0.8095238095238095,
203
  "lr_head": 0.0001,
204
  "lr_backbone": 1e-05,
205
+ "val_loss": 0.4085062007109324,
206
+ "val_accuracy": 0.8666666666666667,
207
+ "val_macro_f1": 0.8666666666666667,
208
+ "val_weighted_f1": 0.8666666666666667
209
  },
210
  {
211
  "epoch": 3,
212
+ "train_loss": 0.4347181845278967,
213
+ "train_acc": 0.7886904761904762,
214
  "lr_head": 9.701478472890248e-05,
215
  "lr_backbone": 9.701478472890248e-06,
216
+ "val_loss": 0.3922509431838989,
217
+ "val_accuracy": 0.9166666666666666,
218
+ "val_macro_f1": 0.9164578111946533,
219
+ "val_weighted_f1": 0.9164578111946533
220
  },
221
  {
222
  "epoch": 4,
223
+ "train_loss": 0.4287990019434974,
224
+ "train_acc": 0.7708333333333334,
225
  "lr_head": 8.84191999343894e-05,
226
  "lr_backbone": 8.841919993438941e-06,
227
+ "val_loss": 0.3846385677655538,
228
+ "val_accuracy": 0.85,
229
+ "val_macro_f1": 0.849624060150376,
230
+ "val_weighted_f1": 0.849624060150376
231
  },
232
  {
233
  "epoch": 5,
234
+ "train_loss": 0.40112980774470736,
235
+ "train_acc": 0.8214285714285714,
236
  "lr_head": 7.525e-05,
237
  "lr_backbone": 7.525e-06,
238
+ "val_loss": 0.3696295181910197,
239
+ "val_accuracy": 0.9,
240
+ "val_macro_f1": 0.899888765294772,
241
+ "val_weighted_f1": 0.899888765294772
242
  },
243
  {
244
  "epoch": 6,
245
+ "train_loss": 0.3798623964900062,
246
+ "train_acc": 0.8422619047619048,
247
  "lr_head": 5.909558479451306e-05,
248
  "lr_backbone": 5.909558479451306e-06,
249
+ "val_loss": 0.365932967265447,
250
+ "val_accuracy": 0.9,
251
+ "val_macro_f1": 0.899888765294772,
252
+ "val_weighted_f1": 0.899888765294772
253
  },
254
  {
255
  "epoch": 7,
256
+ "train_loss": 0.3871297893070039,
257
+ "train_acc": 0.8214285714285714,
258
  "lr_head": 4.190441520548695e-05,
259
  "lr_backbone": 4.190441520548696e-06,
260
+ "val_loss": 0.3574748694896698,
261
+ "val_accuracy": 0.8833333333333333,
262
+ "val_macro_f1": 0.883300916921367,
263
+ "val_weighted_f1": 0.883300916921367
264
  },
265
  {
266
  "epoch": 8,
267
+ "train_loss": 0.39022268142019,
268
+ "train_acc": 0.8005952380952381,
269
  "lr_head": 2.5750000000000013e-05,
270
  "lr_backbone": 2.575000000000001e-06,
271
+ "val_loss": 0.3670339067776998,
272
+ "val_accuracy": 0.85,
273
+ "val_macro_f1": 0.849624060150376,
274
+ "val_weighted_f1": 0.849624060150376
275
  },
276
  {
277
  "epoch": 9,
278
+ "train_loss": 0.3902270041760944,
279
+ "train_acc": 0.7976190476190477,
280
  "lr_head": 1.2580800065610596e-05,
281
  "lr_backbone": 1.2580800065610596e-06,
282
+ "val_loss": 0.3644895474116007,
283
+ "val_accuracy": 0.8666666666666667,
284
+ "val_macro_f1": 0.8665183537263625,
285
+ "val_weighted_f1": 0.8665183537263625
286
  }
287
  ],
288
  "stage_c": [
289
  {
290
  "epoch": 1,
291
+ "train_loss": 0.4103491987500872,
292
+ "train_acc": 0.8125,
293
  "lr_head": 5.000000000000001e-07,
294
  "lr_backbone": 1.5000000000000002e-07,
295
+ "val_loss": 0.3914561231931051,
296
+ "val_accuracy": 0.9166666666666666,
297
+ "val_macro_f1": 0.9164578111946533,
298
+ "val_weighted_f1": 0.9164578111946533
299
  },
300
  {
301
  "epoch": 2,
302
+ "train_loss": 0.4436623099304381,
303
+ "train_acc": 0.7678571428571429,
304
  "lr_head": 5e-05,
305
  "lr_backbone": 1.5e-05,
306
+ "val_loss": 0.3777097463607788,
307
+ "val_accuracy": 0.9166666666666666,
308
+ "val_macro_f1": 0.9164578111946533,
309
+ "val_weighted_f1": 0.9164578111946533
310
  },
311
  {
312
  "epoch": 3,
313
+ "train_loss": 0.3808557249250866,
314
+ "train_acc": 0.8363095238095238,
315
  "lr_head": 4.899745109695881e-05,
316
  "lr_backbone": 1.4699235329087644e-05,
317
+ "val_loss": 0.3577085534731547,
318
+ "val_accuracy": 0.8666666666666667,
319
+ "val_macro_f1": 0.8665183537263625,
320
+ "val_weighted_f1": 0.8665183537263625
321
  },
322
  {
323
  "epoch": 4,
324
+ "train_loss": 0.3782187302907308,
325
+ "train_acc": 0.8392857142857143,
326
  "lr_head": 4.6071024937571735e-05,
327
  "lr_backbone": 1.3821307481271522e-05,
328
+ "val_loss": 0.3452535013357798,
329
+ "val_accuracy": 0.8833333333333333,
330
+ "val_macro_f1": 0.883300916921367,
331
+ "val_weighted_f1": 0.883300916921367
332
  },
333
  {
334
  "epoch": 5,
335
+ "train_loss": 0.38378763056936716,
336
+ "train_acc": 0.8392857142857143,
337
  "lr_head": 4.145780316514581e-05,
338
  "lr_backbone": 1.2437340949543742e-05,
339
+ "val_loss": 0.33214412728945414,
340
+ "val_accuracy": 0.8833333333333333,
341
+ "val_macro_f1": 0.883300916921367,
342
+ "val_weighted_f1": 0.883300916921367
343
  },
344
  {
345
  "epoch": 6,
346
+ "train_loss": 0.3539558429093588,
347
+ "train_acc": 0.8571428571428571,
348
  "lr_head": 3.5531521571796694e-05,
349
  "lr_backbone": 1.0659456471539008e-05,
350
+ "val_loss": 0.3277231236298879,
351
+ "val_accuracy": 0.8833333333333333,
352
+ "val_macro_f1": 0.883300916921367,
353
+ "val_weighted_f1": 0.883300916921367
354
  },
355
  {
356
  "epoch": 7,
357
+ "train_loss": 0.35197025750364574,
358
+ "train_acc": 0.8511904761904762,
359
  "lr_head": 2.877229224726381e-05,
360
  "lr_backbone": 8.631687674179142e-06,
361
+ "val_loss": 0.3296816150347392,
362
+ "val_accuracy": 0.8666666666666667,
363
+ "val_macro_f1": 0.8665183537263625,
364
+ "val_weighted_f1": 0.8665183537263625
365
  }
366
  ]
367
  },
368
  "confusion_matrix": [
369
  [
370
+ 58,
371
+ 2
372
  ],
373
  [
374
+ 16,
375
+ 44
376
  ]
377
  ],
378
+ "report": " precision recall f1-score support\n\n Gyuyig 0.78 0.97 0.87 60\n Tsugdri 0.96 0.73 0.83 60\n\n accuracy 0.85 120\n macro avg 0.87 0.85 0.85 120\nweighted avg 0.87 0.85 0.85 120\n",
379
  "idx_to_label": {
380
  "0": "Gyuyig",
381
  "1": "Tsugdri"
training_history.png CHANGED
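
The `lr_head` and `lr_backbone` values logged per epoch in the `history` block of results.json are consistent with one warmup epoch at 1% of the base rate followed by cosine decay towards that same 1% floor. This is inferred from the logged numbers only; the training script is not part of this diff, so the function below (`lr_at_epoch`, a hypothetical name) is a reconstruction under that assumption, not the author's scheduler. The planned epoch counts (7, 10 and 12 for stages A, B and C) are likewise read off the removed side of the diff.

```python
import math


def lr_at_epoch(epoch, n_epochs, base_lr, floor_ratio=0.01):
    """Reconstructed schedule: 1 warmup epoch, then cosine decay to a floor.

    epoch is 1-based; n_epochs is the planned length of the stage (>= 2).
    """
    floor = base_lr * floor_ratio
    if epoch == 1:
        return floor  # warmup epoch runs at the floor rate
    progress = (epoch - 2) / (n_epochs - 1)
    return floor + (base_lr - floor) * 0.5 * (1.0 + math.cos(math.pi * progress))


# Stage A head rates (base 5e-4, 7 planned epochs), to compare with the log:
for epoch in range(1, 8):
    print(epoch, lr_at_epoch(epoch, 7, 5e-4))
```

Under this reading, the stage B and C rates are unchanged between the two runs because the schedule depends only on the planned stage length, while the new run simply logs fewer epochs in those stages.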