KestrelNet / GoshawkNet: Benchmark Suite
Here's what a tiny model can do.
Five public datasets. Four domains. All under 164K parameters. All CPU-only. All pure NumPy: no PyTorch, no TensorFlow, no GPU. Every result except seizure detection is verified on Kaggle with live scoring.
Results
| Dataset | Domain | Task | Accuracy | F1 / AUC | Params | Size | Latency |
|---|---|---|---|---|---|---|---|
| MIT-BIH Arrhythmia | Cardiology | 5-class ECG | 97.2% | F1 0.853 | 12,756 | 50 KB | 56 μs |
| EEG Brainwave Emotions | Neuroscience | 3-class EEG | 99.1% | F1 0.991 | 163,788 | 640 KB | 1.3 ms |
| EEG Eye State | Neuroscience | Binary EEG | 94.2% | AUC 0.986 | 1,576 | 6 KB | 17 μs |
| Epileptic Seizure | Neurology | Binary EEG | 97.1% | AUC 0.988 | 12,072 | 47 KB | n/a |
| HAR Smartphones | Wearables | 6-class IMU | 94.9% | F1 0.949 | 15,416 | 60 KB | 70 μs |
Total model storage for all five: 803 KB.
For context, a single encoder layer of BERT-base has about 7 million parameters. Our five models combined have 205,608.
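The Size column is consistent with storing every parameter as a 4-byte float32. That encoding is inferred from the numbers, not stated by the repository, so treat the sketch below as a sanity check under that assumption rather than a description of the weight files.

```python
# Parameter counts copied from the Results table.
PARAMS = {
    "MIT-BIH Arrhythmia": 12_756,
    "EEG Brainwave Emotions": 163_788,
    "EEG Eye State": 1_576,
    "Epileptic Seizure": 12_072,
    "HAR Smartphones": 15_416,
}

BYTES_PER_PARAM = 4  # assumption: float32 storage


def size_kb(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate storage in KB, with 1 KB = 1024 bytes."""
    return n_params * bytes_per_param / 1024


# Round each model to whole KB, as the table does.
sizes = {name: round(size_kb(n)) for name, n in PARAMS.items()}
total_params = sum(PARAMS.values())
total_kb = sum(sizes.values())

for name, kb in sizes.items():
    print(f"{name:24s} {PARAMS[name]:>8,d} params  ~{kb} KB")
print(f"Total: {total_params:,d} params, ~{total_kb} KB")
```

Under this assumption the per-model sizes and both totals match the figures quoted above.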
How Small Is Small?
| Dataset | Typical CNN/LSTM | Ours | How much smaller |
|---|---|---|---|
| ECG Heartbeat | 500K-2M params | 12,756 | 40-160x |
| EEG Emotions | 1M+ params | 163,788 | 6x |
| EEG Eye State | 100K+ params | 1,576 | 63x |
| Seizure Detection | 200K+ params | 12,072 | 17x |
| HAR Smartphones | 200K-1M params | 15,416 | 13-65x |
Two Model Families
We ship two architectures, named after raptors: bird size matches model size, hunting style matches classification style.
KestrelNet (Standard FC)
The kestrel is the smallest falcon. It hovers perfectly still, then strikes with precision. KestrelNet is a standard fully-connected network with ReLU activations. Minimal parameters, maximum accuracy.
Input → Dense(hidden₁, ReLU) → Dense(hidden₂, ReLU) → Dense(classes, Softmax)
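A minimal NumPy sketch of that pipeline follows. The function names, the He-style initialization, and the `[16, 8]` hidden sizes are illustrative assumptions; the shipped `inference.py` may be organised differently.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(n_in, hidden, n_classes, seed=0):
    """Return a list of (W, b) pairs, one per Dense layer (hypothetical init)."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, *hidden, n_classes]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Dense+ReLU for every layer except the last, then Dense+Softmax."""
    h = np.atleast_2d(x)
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return softmax(h @ W + b)


# Example: 187 ECG features, two hidden layers, 5 classes.
params = init_params(187, [16, 8], 5)
batch = np.random.default_rng(1).normal(size=(3, 187))
proba = forward(params, batch)
print(proba.shape)  # (3, 5)
```

Each row of `proba` sums to 1, so it can be read directly as class probabilities.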
GoshawkNet (Multivector Products)
The goshawk is a larger raptor that hunts in complex terrain, reading patterns others miss. GoshawkNet replaces standard dot products with multivector products, giving each neuron native access to rotations, reflections, and scaling in a single operation. More parameters, but captures geometric structure in the data that FC nets need many more layers to approximate.
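To make the idea concrete, here is a sketch of a dense layer whose weights, inputs, and outputs are quaternions combined with the Hamilton product (Cl(0,2) is isomorphic to the quaternions, with basis 1, e₁, e₂, e₁e₂). How GoshawkNet actually embeds real features and reads out class scores is not described here, so placing each real feature in the scalar part, and the names `hamilton` and `quaternion_dense`, are assumptions made for illustration.

```python
import numpy as np


def hamilton(q, p):
    """Hamilton product of quaternions stored as (..., 4) arrays [w, x, y, z]."""
    a, b, c, d = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    e, f, g, h = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    return np.stack(
        [
            a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g - b * h + c * e + d * f,
            a * h + b * g - c * f + d * e,
        ],
        axis=-1,
    )


def quaternion_dense(x, W, b):
    """x: (n_in, 4), W: (n_in, n_out, 4), b: (n_out, 4) -> (n_out, 4).

    Each output unit is sum_i W[i, j] * x[i] + b[j], with * the Hamilton product.
    """
    return hamilton(W, x[:, None, :]).sum(axis=0) + b


rng = np.random.default_rng(0)
n_in, n_out = 187, 16
W = rng.normal(scale=0.1, size=(n_in, n_out, 4))
b = np.zeros((n_out, 4))

# Assumption: each real feature becomes a quaternion with zero imaginary parts.
x = np.zeros((n_in, 4))
x[:, 0] = rng.normal(size=n_in)

h = quaternion_dense(x, W, b)
n_params = W.size + b.size
print(h.shape, n_params)  # (16, 4) 12032
```

One quaternion multiplication mixes all four components of its operands, which is where the "rotation and scaling in a single operation" behaviour comes from. The cost is four real parameters per weight.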
Best model per dataset:
| Dataset | Best Model | Architecture |
|---|---|---|
| ECG Heartbeat | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| EEG Emotions | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| EEG Eye State | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| Seizure Detection | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
| HAR Smartphones | GoshawkNet Cl(0,2) | Quaternion, [16, 8] hidden |
Quaternion algebra (Cl(0,2), dimension 4) wins on all five datasets.
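Although every best model uses the same `[16, 8]` hidden layout, the parameter counts differ because the input width and class count differ. The reported counts are reproduced exactly by counting four real numbers per weight and per bias in an `[n_in, 16, 8, n_classes]` stack. This is an observation about the published numbers, not a confirmed description of the implementation, and `goshawk_param_count` is a hypothetical helper.

```python
def goshawk_param_count(n_in, hidden, n_classes, algebra_dim=4):
    """Real-valued parameter count, assuming algebra_dim reals per weight and bias."""
    sizes = [n_in, *hidden, n_classes]
    units = sum((a + 1) * b for a, b in zip(sizes[:-1], sizes[1:]))
    return algebra_dim * units


# (input features, classes, reported parameter count) from this README.
SPECS = {
    "ECG Heartbeat": (187, 5, 12_756),
    "EEG Emotions": (2_548, 3, 163_788),
    "EEG Eye State": (14, 2, 1_576),
    "Seizure Detection": (178, 2, 12_072),
    "HAR Smartphones": (228, 6, 15_416),
}

counts = {
    name: goshawk_param_count(n_in, [16, 8], n_classes)
    for name, (n_in, n_classes, _) in SPECS.items()
}

for name, (_, _, reported) in SPECS.items():
    print(f"{name:18s} computed={counts[name]:>7,d} reported={reported:>7,d}")
```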
Per-Dataset Details
ECG Heartbeat: MIT-BIH Arrhythmia Database
- Samples: 87,554 train / 21,892 test
- Features: 187 time-series values per heartbeat
- Classes: Normal (N), Supraventricular (S), Ventricular (V), Fusion (F), Unknown (Q)
- Best model: GoshawkNet Cl(0,2) [16,8], 97.2% accuracy, 12,756 params
- Kaggle notebook: samareddy94/gnaninet-ecg-benchmark
| Class | Accuracy |
|---|---|
| Normal (N) | 99.2% |
| Supraventricular (S) | 64.6% |
| Ventricular (V) | 90.9% |
| Fusion (F) | 63.0% |
| Unknown (Q) | 95.9% |
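The gap between 97.2% overall accuracy and a macro F1 of 0.853 comes from the rare classes: macro averaging weights every class equally, however few samples it has. The sketch below shows how per-class accuracy (computed here as recall, which is an assumption about how the table was produced) and macro F1 fall out of a confusion matrix. The labels are toy values, not benchmark data.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def per_class_recall(cm):
    # Fraction of each true class that was predicted correctly.
    return np.diag(cm) / np.maximum(cm.sum(axis=1), 1)


def macro_f1(cm):
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)
    recall = tp / np.maximum(cm.sum(axis=1), 1)
    denom = np.maximum(precision + recall, 1e-12)
    return float(np.mean(2 * precision * recall / denom))


# Toy labels for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2, 2])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
recall = per_class_recall(cm)
f1 = macro_f1(cm)
accuracy = np.trace(cm) / cm.sum()
print(recall, f1, accuracy)
```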
EEG Brainwave Emotions
- Samples: 2,132 (1,707 train / 425 test)
- Features: 2,548 EEG features (channel means + FFT)
- Classes: Negative, Neutral, Positive
- Best model: GoshawkNet Cl(0,2) [16,8], 99.1% accuracy, 163,788 params
- Kaggle notebook: samareddy94/99-eeg-emotion-detection-164k-params-no-gpu
| Class | Accuracy |
|---|---|
| Negative | 99.3% |
| Neutral | 100.0% |
| Positive | 97.9% |
EEG Eye State: UCI / Roesler
- Samples: 14,980 (11,985 train / 2,995 test)
- Features: 14 EEG channels (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4)
- Classes: Eyes Open, Eyes Closed
- Best model: GoshawkNet Cl(0,2) [16,8], 94.2% accuracy, 1,576 params
- Kaggle notebook: samareddy94/gnaninet-eeg-eyestate-benchmark
The smallest model in the suite: 1,576 parameters, 6 KB. At 17 μs per inference, it runs at roughly 60,000 inferences/sec on CPU.
Epileptic Seizure Recognition: Bonn University
- Samples: 11,500 (9,200 train / 2,300 test)
- Features: 178 EEG time-series values
- Classes: Seizure vs Non-seizure (binary)
- Best model: GoshawkNet Cl(0,2) [16,8], 97.1% accuracy, AUC 0.988, 12,072 params
An AUC of 0.988 means that a randomly chosen seizure sample receives a higher score than a randomly chosen non-seizure sample 98.8% of the time, which is the property that matters for clinical screening.
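That ranking interpretation can be computed directly by comparing every positive score against every negative score. The sketch below is a generic NumPy illustration with toy scores; it is not the evaluation code used for the benchmark, and `auc_pairwise` is a hypothetical name.

```python
import numpy as np


def auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Memory grows with n_pos * n_neg,
    which is fine for test sets of a few thousand samples.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


# Toy example: 3 positives, 3 negatives, one mis-ranked pair (0.4 vs 0.5).
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = auc_pairwise(y, s)
print(round(auc, 4))  # 0.8889
```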
HAR Smartphones: UCI Activity Recognition
- Samples: 7,352 train / 2,947 test (official split)
- Features: 228 triaxial accelerometer + gyroscope features
- Classes: Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, Laying
- Best model: GoshawkNet Cl(0,2) [16,8], 95.7% accuracy locally / 94.9% on Kaggle live scoring, 15,416 params
- Kaggle notebook: samareddy94/gnaninet-har-benchmark
| Class | Accuracy |
|---|---|
| Walking | 99.0% |
| Walking Upstairs | 90.7% |
| Walking Downstairs | 96.4% |
| Sitting | 91.9% |
| Standing | 95.7% |
| Laying | 99.8% |
Training Details
All models share one training recipe; where a range is given, the setting varies by dataset:
- Optimizer: Adam (lr=0.001, β₁=0.9, β₂=0.999)
- LR Schedule: Warmup-cosine (10-epoch warmup)
- Early stopping: Patience 30-40 on validation loss
- Batch size: 64-128
- L2 regularization: λ = 1e-5 to 1e-4
- Gradient clipping: 5.0
- Normalization: Z-score, fit on training set only
- Backpropagation: Analytic (hand-derived gradients, no autograd)
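The optimizer settings above can be sketched in plain NumPy as follows. Several details are assumptions, since the list does not specify them: linear warmup, clipping by global norm (rather than by value), adding the L2 term to the gradient before clipping, and the total number of epochs.

```python
import numpy as np


def warmup_cosine_lr(epoch, total_epochs, base_lr=1e-3, warmup_epochs=10):
    """Linear warmup to base_lr, then cosine decay to zero (assumed shape)."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + np.cos(np.pi * progress))


def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale all gradients together so their joint L2 norm is <= max_norm."""
    total = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads]


def adam_step(params, grads, state, lr, beta1=0.9, beta2=0.999, eps=1e-8, l2=1e-4):
    """One Adam update with an L2 penalty gradient and gradient clipping."""
    state["t"] += 1
    t = state["t"]
    grads = [g + l2 * p for g, p in zip(grads, params)]  # gradient of (l2/2)*||p||^2
    grads = clip_by_global_norm(grads)
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = state["v"][i] / (1 - beta2 ** t)  # bias-corrected second moment
        new_params.append(p - lr * m_hat / (np.sqrt(v_hat) + eps))
    return new_params


# Toy problem: minimise 0.5 * ||w||^2, whose gradient is w itself.
params = [np.array([3.0, -4.0])]
state = {"t": 0, "m": [np.zeros(2)], "v": [np.zeros(2)]}
for epoch in range(50):
    lr = warmup_cosine_lr(epoch, total_epochs=50)
    params = adam_step(params, [params[0].copy()], state, lr)
final_norm = float(np.linalg.norm(params[0]))
```

With a peak learning rate of 0.001 the toy weights move only slightly in 50 steps; the point of the example is the update rule, not convergence speed.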
Training is fast: all five models train in under 10 minutes total on a laptop CPU.
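The normalization rule listed above (Z-score statistics fit on the training set only) is what keeps test data from leaking into preprocessing. A minimal sketch, assuming a small hypothetical `ZScore` helper and synthetic data:

```python
import numpy as np


class ZScore:
    """Per-feature standardisation with statistics taken from training data only."""

    def fit(self, X_train):
        self.mean_ = X_train.mean(axis=0)
        self.std_ = X_train.std(axis=0)
        # Guard against constant features to avoid division by zero.
        self.std_ = np.where(self.std_ == 0, 1.0, self.std_)
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_


rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(200, 4))
X_test = rng.normal(loc=5.0, scale=3.0, size=(50, 4))

scaler = ZScore().fit(X_train)        # statistics come from the training split
X_train_n = scaler.transform(X_train)
X_test_n = scaler.transform(X_test)   # test split reuses the training statistics
```

The test split will not have exactly zero mean and unit variance after this transform, and that is expected.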
Repository Structure
├── ecg-heartbeat/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8], 97.2% accuracy
│   └── results.json       # Full benchmark comparison (4 models)
├── eeg-emotions/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8], 99.1% accuracy
│   └── results.json
├── eye-state/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8], 94.2% accuracy
│   └── results.json
├── seizure-prediction/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8], 97.1% accuracy
│   └── results.json
├── har-smartphones/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8], 94.9% accuracy
│   └── results.json
└── inference.py           # Self-contained inference loader (no dependencies beyond NumPy)
Quick Start
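import numpy as np
from inference import load_model

# Load any model by its directory name
model = load_model("ecg-heartbeat")

# Placeholder input: one heartbeat with 187 features (random values, for illustration only)
x = np.random.randn(187)
proba = model.predict_proba(x)
print(proba)  # 5-class probabilities, e.g. [0.92, 0.01, 0.05, 0.01, 0.01]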
Intended Use
- Clinical screening: Pre-filter for ECG/EEG analysis before specialist review
- Edge deployment: Wearables, IoT sensors, embedded devices, with no GPU and no cloud
- Ensemble first stage: Fast, tiny model screens easy cases; complex model handles the rest
- Research baseline: Reproducible benchmarks on public datasets with minimal compute
- Education: Complete from-scratch neural network with analytic gradients
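The "ensemble first stage" use case can be sketched as a confidence-gated cascade: the tiny model answers when its top probability clears a threshold, and only the remaining samples go to a larger model. The threshold of 0.9 and both stand-in models below are hypothetical; in practice the threshold would be tuned on validation data.

```python
import numpy as np


def cascade_predict(X, small_predict_proba, large_predict, threshold=0.9):
    """Return (labels, handled_by_small) for a two-stage classifier."""
    proba = small_predict_proba(X)
    confident = proba.max(axis=1) >= threshold
    labels = proba.argmax(axis=1)
    if (~confident).any():
        # Defer only the uncertain samples to the larger model.
        labels[~confident] = large_predict(X[~confident])
    return labels, confident


# Stand-in models for illustration only.
def toy_small(X):
    p1 = 1.0 / (1.0 + np.exp(-4.0 * X[:, 0]))
    return np.column_stack([1.0 - p1, p1])


def toy_large(X):
    return (X[:, 0] > 0).astype(int)


X = np.array([[2.0], [-2.0], [0.1], [-0.05]])
labels, handled_by_small = cascade_predict(X, toy_small, toy_large)
print(labels, handled_by_small)
```

Here the first two samples are settled by the small model and the last two are deferred, so the expensive model runs on half the batch.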
Limitations
- Models are trained on tabular/flattened features, not raw waveforms
- Per-class accuracy varies: rare classes (ECG Fusion, ECG Supraventricular) have lower recall
- No sequence modeling β each sample is classified independently
- Medical models are NOT validated for clinical use; they are research benchmarks only
Kaggle Verification
All results except seizure prediction have been verified with live Kaggle notebook scoring:
| Dataset | Kaggle Notebook |
|---|---|
| ECG Heartbeat | samareddy94/gnaninet-ecg-benchmark |
| EEG Emotions | samareddy94/99-eeg-emotion-detection-164k-params-no-gpu |
| EEG Eye State | samareddy94/gnaninet-eeg-eyestate-benchmark |
| HAR Smartphones | samareddy94/gnaninet-har-benchmark |
Citation
@misc{kestrelnet-benchmarks-2026,
title={KestrelNet/GoshawkNet: Tiny Neural Classifiers for Biosignal and Sensor Data},
author={Sama Reddy},
year={2026},
url={https://huggingface.co/reddysama/kestrelnet-benchmarks}
}
No PyTorch. No TensorFlow. No GPU. Just NumPy and math.
Evaluation results
- Accuracy on MIT-BIH Arrhythmia (self-reported): 0.972
- Macro F1 on MIT-BIH Arrhythmia (self-reported): 0.853
- Accuracy on EEG Brainwave Emotions (self-reported): 0.991
- Macro F1 on EEG Brainwave Emotions (self-reported): 0.991
- Accuracy on EEG Eye State (UCI) (self-reported): 0.942
- AUC-ROC on EEG Eye State (UCI) (self-reported): 0.986
- Accuracy on Bonn University EEG (self-reported): 0.971
- AUC-ROC on Bonn University EEG (self-reported): 0.988
- Accuracy on UCI HAR Smartphones (self-reported): 0.949
- Macro F1 on UCI HAR Smartphones (self-reported): 0.949