How to use dplotnikov/stratabert-tiny-ag-news-smoke with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="dplotnikov/stratabert-tiny-ag-news-smoke", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("dplotnikov/stratabert-tiny-ag-news-smoke", trust_remote_code=True, dtype="auto")
```
stratabert-tiny-smoke
Model Summary
This is a StrataBERT diagnostic checkpoint from run_001. Claim status: diagnostic_only. It is not a release-quality checkpoint and must not be used for public quality or efficiency claims.
Architecture
tokens -> embeddings -> [global attention / bidirectional SSM / local attention]* -> mask-aware pooling -> task head
Architecture class: StrataBertForSequenceClassification. Layer types: ['global_attention', 'ssm', 'local_attention']. Hidden size: 48. Max positions: 128.
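The "mask-aware pooling" stage in the pipeline above can be illustrated with a minimal sketch. It assumes masked mean pooling, where padded positions are excluded from the average; the card does not say which pooling variant StrataBERT uses, and the function name `masked_mean_pool` is hypothetical.

```python
def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors, skipping positions where the mask is 0.

    hidden_states: list of per-token vectors (each a list of floats).
    attention_mask: list of 0/1 flags, 1 for real tokens, 0 for padding.
    Assumption: mean pooling; the real model may pool differently.
    """
    dim = len(hidden_states[0])
    kept = sum(attention_mask)
    if kept == 0:
        # No real tokens: return zeros instead of dividing by zero.
        return [0.0] * dim
    totals = [0.0] * dim
    for vector, keep in zip(hidden_states, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                totals[i] += value
    return [total / kept for total in totals]


# The third position is padding, so its large values do not affect the result.
pooled = masked_mean_pool(
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
    [1, 1, 0],
)
print(pooled)  # [2.0, 3.0]
```

Excluding padded positions keeps the pooled vector independent of how much padding a batch happens to add.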
Parameter Count
Total parameters: 2498404.
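As a rough sanity check on where those parameters sit, the sketch below assumes a single token-embedding table of shape vocab_size x hidden_size, using the tokenizer vocabulary size (50368) and hidden size (48) reported elsewhere in this card. That layout is an assumption; the card does not break the count down by component.

```python
# Assumption: one token-embedding matrix of shape (vocab_size, hidden_size).
VOCAB_SIZE = 50368      # tokenizer vocab size reported in this card
HIDDEN_SIZE = 48        # hidden size reported in this card
TOTAL_PARAMS = 2498404  # total parameter count reported in this card

embedding_params = VOCAB_SIZE * HIDDEN_SIZE
remaining_params = TOTAL_PARAMS - embedding_params
embedding_share = embedding_params / TOTAL_PARAMS

print(embedding_params)           # 2417664
print(remaining_params)           # 80740
print(f"{embedding_share:.1%}")   # 96.8%
```

Under that assumption, roughly 81k parameters remain for the encoder layers and task head.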
Training Data
Data artifacts:
- train_index: data/eval_frozen/run_001/ag_news_train_index_sample64.json
- eval_index: data/eval_frozen/run_001/ag_news_eval_index_sample200.json
Raw text is not embedded in this card or the frozen eval indices.
Objective Mix
task: 1.0
Teacher Models
No teacher model is used for this checkpoint.
Licenses
Project code license: MIT. Dataset audit summary:
- ag_news_v001: restricted_noncommercial_unclear; No standard permissive license is declared.
- arxiv_classification_v001: needs_review_full_text_rights; Selected HF repo does not declare a data license.
- bc5cdr_v001: needs_review_bc5cdr_tner_mirror; No source-license research entry is present; manifest note: Canonical bigbio/bc5cdr script is disabled by current datasets versions; executable manifest uses TNER BC5CDR converted parquet.
- conll2003_v001: restricted_avoid_publication_claims; Highest-risk MVP dataset because the source text is Reuters copyrighted newswire.
- eurlex57k_v001: needs_review_lexglue_eurlex; No source-license research entry is present; manifest note: HF datasets metadata inspected with datasets.load_dataset_builder('coastalcph/lex_glue', 'eurlex') on 2026-06-10.
- hyperpartisan_news_v001: needs_review_hyperpartisan_mirror; No source-license research entry is present; manifest note: HF parquet metadata inspected on 2026-06-10 via jonathanli/hyperpartisan-longformer-split.
- imdb_v001: restricted_noncommercial_unclear; HF license tag is other rather than a permissive license.
- openpii_1m_v001: approved_cc_by_4_0_attribution_required; No source-license research entry is present; manifest note: HF datasets metadata inspected with datasets.load_dataset_builder('ai4privacy/pii-masking-openpii-1m', 'default') on 2026-06-10.
- patent_classification_v001: needs_review_mirror_license; The selected ccdv sample repo does not declare its own license.
- pubmed_200k_rct_v001: needs_review_pubmed_rct_mirror; No source-license research entry is present; manifest note: HF parquet metadata inspected on 2026-06-10.
- scicite_v001: needs_review_allenai_scicite; No source-license research entry is present; manifest note: Legacy dataset script is disabled by current datasets versions; executable manifest uses HF converted parquet files.
- twenty_newsgroups_v001: needs_review_dataset_card_blank; No source-license research entry is present; manifest note: HF parquet metadata inspected on 2026-06-10 via refs/convert/parquet.
Intended Uses
- Local smoke testing of StrataBERT checkpoint loading, evaluation scripts, and metadata plumbing.
- Reproducibility checks for run_001 diagnostic artifacts.
Out-of-Scope Uses
- Public benchmark claims.
- Production classification or token-classification deployment.
- Commercial reuse of dataset-derived behavior without legal review of the relevant datasets.
Evaluation
| metric | value |
|---|---|
| accuracy | 0.26 |
| macro_f1 | 0.10317460317460318 |
| weighted_f1 | 0.10730158730158731 |
| loss | 1.3858718490600586 |
Evaluation artifact: checkpoints/run_001/tiny_ag_news_smoke.
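The reported metrics can be cross-checked with a little arithmetic. The sketch below works through a hypothetical scenario in which the classifier predicts the same one of the four AG News classes for all 200 evaluation examples and is right on 52 of them (0.26 x 200). This is an interpretation, not something the card states; it simply shows that such a constant predictor would produce F1 values matching the reported ones, and that the reported loss sits close to the chance-level cross-entropy ln(4).

```python
import math

NUM_CLASSES = 4   # AG News has four labels
SUPPORT = 200     # evaluation examples reported in this card
CORRECT = 52      # 0.26 accuracy * 200 examples

# Hypothetical scenario: every example receives the same predicted class.
tp = CORRECT               # examples that truly belong to the predicted class
fp = SUPPORT - CORRECT     # everything else is a false positive for that class
fn = 0                     # no example of that class is ever missed
f1_predicted_class = 2 * tp / (2 * tp + fp + fn)

# The three never-predicted classes have F1 = 0, so only one term contributes.
macro_f1 = f1_predicted_class / NUM_CLASSES
weighted_f1 = (CORRECT / SUPPORT) * f1_predicted_class
chance_loss = math.log(NUM_CLASSES)

print(round(macro_f1, 6))     # 0.103175
print(round(weighted_f1, 6))  # 0.107302
print(round(chance_loss, 6))  # 1.386294
```

Read this way, the numbers are consistent with a checkpoint that has not yet learned to separate the classes, which fits its diagnostic_only status.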
Length-Bucketed Results
| bucket | support | accuracy |
|---|---|---|
| 0_512 | 200 | 0.26 |
Latency and Memory
| item | value |
|---|---|
| device | cpu |
| batch size | 2 |
| sequence length | 128 |
| p50 latency ms | 10.763351499917917 |
| p95 latency ms | 12.447670099209063 |
| latency 95% CI ms | 0.6102587742635365 |
| examples/sec | 180.17026675821398 |
| tokens/sec | 23061.79414505139 |
| OOM status | not_oom |
| max batch under memory cap | 2 |
Memory measurements are not release-grade in this diagnostic card unless explicitly listed above.
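The throughput rows in the table can be related to each other with a short calculation. The sketch below assumes every example is padded to the full sequence length of 128 and that examples/sec was derived from the mean per-batch latency; neither is stated in the card.

```python
BATCH_SIZE = 2
SEQ_LEN = 128
EXAMPLES_PER_SEC = 180.17026675821398
TOKENS_PER_SEC = 23061.79414505139

# Assumption: every example contributes exactly SEQ_LEN tokens.
tokens_from_examples = EXAMPLES_PER_SEC * SEQ_LEN

# Assumption: throughput was computed from the mean time per batch.
mean_batch_latency_ms = 1000.0 * BATCH_SIZE / EXAMPLES_PER_SEC

print(abs(tokens_from_examples - TOKENS_PER_SEC) < 1e-6)  # True
print(f"{mean_batch_latency_ms:.2f}")                     # 11.10
```

The implied mean batch latency of about 11.10 ms falls between the reported p50 (10.76 ms) and p95 (12.45 ms).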
Hardware and Software
- Training/eval torch: 2.12.0+cu130
- CUDA available during checkpoint creation: False
- Latency environment: {'cuda': '13.0', 'cuda_available': False, 'platform': 'Linux-6.14.0-37-generic-x86_64-with-glibc2.41', 'python': '3.12.13', 'torch': '2.12.0+cu130'}
- Vast AI: None
Known Limitations
- Random or tiny diagnostic training only; no release-quality pretraining.
- Mandatory ModernBERT, Ettin, DeBERTa-v3, Longformer, BigBird, and embedding baselines are still pending.
- Long-context 2k/4k/8k claims are unsupported by this card.
- Dataset license caveats remain unresolved for public claims.
Ethical and Privacy Considerations
This checkpoint is diagnostic and should not be deployed. Dataset provenance and privacy review are incomplete for release use, and token-classification public claims require a publication-safe dataset replacement or legal approval.
Reproducibility
- Training command: scripts/finetune_classification.py --train-index data/eval_frozen/run_001/ag_news_train_index_sample64.json --train-split train --eval-index data/eval_frozen/run_001/ag_news_eval_index_sample200.json --eval-split test --max-train-examples 32 --max-eval-examples 64 --batch-size 8 --epochs 1 --max-length 96 --lr 5e-4 --seed 1337 --output runs/run_001/eval_reports/stratabert_tiny_ag_news_finetune_smoke.json --checkpoint-dir checkpoints/run_001/tiny_ag_news_smoke
- Tokenizer: {'source': 'answerdotai/ModernBERT-base', 'vocab_size': 50368}
- Seed: 1337
- Checkpoint path: checkpoints/run_001/tiny_ag_news_smoke/model.safetensors
- Evaluation reports: data/eval_frozen/run_001/ag_news_eval_index_sample200.json
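For scripted reruns, the training command can be assembled programmatically. The sketch below only builds and prints the argument list; the flags and values are copied from the command in this card, while launching the script through the current Python interpreter and the helper name `build_command` are assumptions.

```python
import shlex
import sys

SCRIPT = "scripts/finetune_classification.py"

# Flags and values copied from the training command in this card.
ARGS = {
    "--train-index": "data/eval_frozen/run_001/ag_news_train_index_sample64.json",
    "--train-split": "train",
    "--eval-index": "data/eval_frozen/run_001/ag_news_eval_index_sample200.json",
    "--eval-split": "test",
    "--max-train-examples": "32",
    "--max-eval-examples": "64",
    "--batch-size": "8",
    "--epochs": "1",
    "--max-length": "96",
    "--lr": "5e-4",
    "--seed": "1337",
    "--output": "runs/run_001/eval_reports/stratabert_tiny_ag_news_finetune_smoke.json",
    "--checkpoint-dir": "checkpoints/run_001/tiny_ag_news_smoke",
}


def build_command(args=ARGS):
    """Return the argv list; assumes the script runs under sys.executable."""
    command = [sys.executable, SCRIPT]
    for flag, value in args.items():
        command.extend([flag, value])
    return command


command = build_command()
# Print a shell-quoted form for inspection; pass `command` to
# subprocess.run(command, check=True) from the repository root to execute it.
print(shlex.join(command))
```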
Citation
Use CITATION.cff from this repository. Title: StrataBERT: A Padding-Safe SSM-Attention Encoder for Efficient Long-Document Classification.
Exact Git Commit
Commit: no_commit_yet. Dirty worktree at checkpoint creation: True.