KestrelNet / GoshawkNet — Benchmark Suite

Here's what a tiny model can do.

Five public datasets. Five domains. All under 164K parameters. All CPU-only. All pure NumPy: no PyTorch, no TensorFlow, no GPU. Four of the five results are verified on Kaggle with live scoring; the seizure model is the exception.

Results

Dataset	Domain	Task	Accuracy	F1 / AUC	Params	Size	Latency
MIT-BIH Arrhythmia	Cardiology	5-class ECG	97.2%	F1 0.853	12,756	50 KB	56 μs
EEG Brainwave Emotions	Neuroscience	3-class EEG	99.1%	F1 0.991	163,788	640 KB	1.3 ms
EEG Eye State	Neuroscience	Binary EEG	94.2%	AUC 0.986	1,576	6 KB	17 μs
Epileptic Seizure	Neurology	Binary EEG	97.1%	AUC 0.988	12,072	47 KB	—
HAR Smartphones	Wearables	6-class IMU	94.9%	F1 0.949	15,416	60 KB	70 μs

Total model storage for all five: 803 KB.

For context, a single Transformer layer of BERT-base has roughly 7 million parameters. Our five models combined have 205,608.
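The totals above can be reproduced from the Results table. The sketch below assumes each parameter is stored as a 4-byte float32 and that 1 KB means 1024 bytes; neither is stated in this card, but under those assumptions the per-model sizes and the 803 KB total both come out as listed.

```python
# Parameter counts copied from the Results table.
PARAMS = {
    "MIT-BIH Arrhythmia": 12_756,
    "EEG Brainwave Emotions": 163_788,
    "EEG Eye State": 1_576,
    "Epileptic Seizure": 12_072,
    "HAR Smartphones": 15_416,
}

BYTES_PER_PARAM = 4  # assumption: weights stored as float32


def size_kb(n_params: int) -> int:
    """Approximate storage in KB (1 KB = 1024 bytes), rounded to nearest."""
    return round(n_params * BYTES_PER_PARAM / 1024)


total_params = sum(PARAMS.values())
total_kb = sum(size_kb(n) for n in PARAMS.values())

for name, n in PARAMS.items():
    print(f"{name:24s} {n:>8,d} params  ~{size_kb(n)} KB")
print(f"Total: {total_params:,d} params, ~{total_kb} KB")
# Total: 205,608 params, ~803 KB
```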

How Small Is Small?

Dataset	Typical CNN/LSTM	Ours	How much smaller
ECG Heartbeat	500K – 2M params	12,756	40–160x
EEG Emotions	1M+ params	163,788	6x
EEG Eye State	100K+ params	1,576	63x
Seizure Detection	200K+ params	12,072	17x
HAR Smartphones	200K – 1M params	15,416	13–65x

Two Model Families

We ship two architectures, named after raptors — bird size matches model size, hunting style matches classification style.

KestrelNet (Standard FC)

The kestrel is the smallest falcon. It hovers perfectly still, then strikes with precision. KestrelNet is a standard fully-connected network with ReLU activations. Minimal parameters, maximum accuracy.

Input → Dense(hidden₁, ReLU) → Dense(hidden₂, ReLU) → Dense(classes, Softmax)
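As an illustration of the layer stack above, here is a minimal NumPy forward pass. The hidden widths (16 and 8), the He-style random initialization, and the helper names `init_params` and `forward` are assumptions made for the sketch; they are not taken from `inference.py` or the released weights.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(n_in, hidden1, hidden2, n_classes, seed=0):
    """Random weights for illustration only; not the released weights."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, hidden1, hidden2, n_classes]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """Input -> Dense(ReLU) -> Dense(ReLU) -> Dense(Softmax)."""
    h = np.atleast_2d(x)
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return softmax(h @ W + b)


params = init_params(n_in=187, hidden1=16, hidden2=8, n_classes=5)
proba = forward(params, np.zeros(187))
print(proba.shape)  # (1, 5): one row of class probabilities
```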

GoshawkNet (Multivector Products)

The goshawk is a larger raptor that hunts in complex terrain, reading patterns others miss. GoshawkNet replaces standard dot products with multivector (Clifford algebra) products, so a single neuron operation can express rotations, reflections, and scaling. It uses more parameters, but it can capture geometric structure in the data that FC nets may need many more layers to approximate.
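For the quaternion case, Cl(0,2), the multivector product is the Hamilton product. The sketch below shows that product and one plausible way to build a dense layer from it. The function names, the (w, x, y, z) component order, and the weight layout are assumptions for illustration; the actual GoshawkNet layer may be organized differently.

```python
import numpy as np


def hamilton_product(p, q):
    """Quaternion product p*q for arrays of shape (..., 4) as (w, x, y, z)."""
    pw, px, py, pz = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    qw, qx, qy, qz = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    return np.stack(
        [
            pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw,
        ],
        axis=-1,
    )


def quaternion_dense(x, W, b):
    """y[j] = sum_i W[i, j] * x[i] + b[j], with * the Hamilton product.

    x: (n_in, 4), W: (n_in, n_out, 4), b: (n_out, 4). Hypothetical layout.
    """
    return hamilton_product(W, x[:, None, :]).sum(axis=0) + b


i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
print(hamilton_product(i, j))  # i * j = k, components (0, 0, 0, 1)
print(hamilton_product(j, i))  # j * i = -k, so the product is non-commutative
```

Because the product mixes all four components of both operands, one quaternion weight applies a combined rotation and scaling to its input, which is the property the paragraph above refers to.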

Best model per dataset:

Dataset	Best Model	Architecture
ECG Heartbeat	GoshawkNet Cl(0,2)	Quaternion, [16, 8] hidden
EEG Emotions	GoshawkNet Cl(0,2)	Quaternion, [16, 8] hidden
EEG Eye State	GoshawkNet Cl(0,2)	Quaternion, [16, 8] hidden
Seizure Detection	GoshawkNet Cl(0,2)	Quaternion, [16, 8] hidden
HAR Smartphones	GoshawkNet Cl(0,2)	Quaternion, [16, 8] hidden

Quaternion algebra (Cl(0,2), dimension 4) gives the best model on all five datasets.

Per-Dataset Details

ECG Heartbeat — MIT-BIH Arrhythmia Database

Samples: 87,554 train / 21,892 test
Features: 187 time-series values per heartbeat
Classes: Normal (N), Supraventricular (S), Ventricular (V), Fusion (F), Unknown (Q)
Best model: GoshawkNet Cl(0,2) [16,8] — 97.2% accuracy, 12,756 params
Kaggle notebook: samareddy94/gnaninet-ecg-benchmark

Class	Accuracy
Normal (N)	99.2%
Supraventricular (S)	64.6%
Ventricular (V)	90.9%
Fusion (F)	63.0%
Unknown (Q)	95.9%
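Overall accuracy (97.2%) is well above the macro F1 (0.853) because macro F1 weights every class equally, so the weaker S and F classes pull it down. The sketch below shows how both numbers come from one confusion matrix. It assumes the per-class "accuracy" in the table is per-class recall, and it uses made-up toy labels, not benchmark data.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t, p] counts samples of true class t predicted as class p."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def per_class_report(cm):
    """Return per-class recall, per-class F1, and overall accuracy."""
    tp = np.diag(cm).astype(float)
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # per-class accuracy
    precision = tp / np.maximum(cm.sum(axis=0), 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return recall, f1, tp.sum() / cm.sum()


# Toy labels: 8 samples of class 0, 2 of class 1 (illustrative only).
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
recall, f1, accuracy = per_class_report(confusion_matrix(y_true, y_pred, 2))
print(f"accuracy={accuracy:.3f}  recall={recall}  macro F1={f1.mean():.3f}")
```

In the toy data a single error on the rare class leaves accuracy at 0.9 while macro F1 drops to about 0.80, the same pattern as in the ECG results.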

EEG Brainwave Emotions

Samples: 2,132 (1,707 train / 425 test)
Features: 2,548 EEG features (channel means + FFT)
Classes: Negative, Neutral, Positive
Best model: GoshawkNet Cl(0,2) [16,8] — 99.1% accuracy, 163,788 params
Kaggle notebook: samareddy94/99-eeg-emotion-detection-164k-params-no-gpu

Class	Accuracy
Negative	99.3%
Neutral	100.0%
Positive	97.9%

EEG Eye State — UCI / Roesler

Samples: 14,980 (11,985 train / 2,995 test)
Features: 14 EEG channels (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4)
Classes: Eyes Open, Eyes Closed
Best model: GoshawkNet Cl(0,2) [16,8] — 94.2% accuracy, 1,576 params
Kaggle notebook: samareddy94/gnaninet-eeg-eyestate-benchmark

The smallest model in the suite: 1,576 parameters, 6 KB. At 17 μs per inference, it runs at roughly 60,000 inferences/sec on CPU.
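Throughput is the reciprocal of per-inference latency (1 / 17 μs is about 59,000 per second). A minimal way to measure it on your own CPU is sketched below. The `measure_latency` helper and the stand-in linear model are hypothetical; swap in the real model's predict call to time it.

```python
import time
import numpy as np


def measure_latency(predict, x, n_warmup=100, n_runs=2000):
    """Mean wall-clock seconds per call of predict(x)."""
    for _ in range(n_warmup):          # warm caches before timing
        predict(x)
    start = time.perf_counter()
    for _ in range(n_runs):
        predict(x)
    return (time.perf_counter() - start) / n_runs


# Stand-in model: one random 14 -> 2 linear map, NOT the released network.
rng = np.random.default_rng(0)
W = rng.normal(size=(14, 2))
latency = measure_latency(lambda x: x @ W, rng.normal(size=14))
print(f"{latency * 1e6:.1f} us per inference, {1 / latency:,.0f} inferences/sec")
```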

Epileptic Seizure Recognition — Bonn University

Samples: 11,500 (9,200 train / 2,300 test)
Features: 178 EEG time-series values
Classes: Seizure vs Non-seizure (binary)
Best model: GoshawkNet Cl(0,2) [16,8] — 97.1% accuracy, AUC 0.988, 12,072 params

An AUC of 0.988 means that a randomly chosen seizure sample receives a higher score than a randomly chosen non-seizure sample 98.8% of the time, which matters for clinical screening.
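The ranking interpretation of AUC can be computed directly by comparing every positive score with every negative score. This is a small sketch with made-up scores; `auc_pairwise` is an illustrative helper, not part of this repository.

```python
import numpy as np


def auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]   # every positive vs every negative
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


# Made-up scores for illustration, not model outputs.
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc_pairwise(y, s))  # 8 of 9 pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples, so it suits small checks like this one rather than large test sets.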

HAR Smartphones — UCI Activity Recognition

Samples: 7,352 train / 2,947 test (official split)
Features: 228 triaxial accelerometer + gyroscope features
Classes: Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, Laying
Best model: GoshawkNet Cl(0,2) [16,8] — 95.7% local / 94.9% Kaggle live, 15,416 params
Kaggle notebook: samareddy94/gnaninet-har-benchmark

Class	Accuracy
Walking	99.0%
Walking Upstairs	90.7%
Walking Downstairs	96.4%
Sitting	91.9%
Standing	95.7%
Laying	99.8%

Training Details

All models share one training recipe; where a range is given, the value varies by dataset:

Optimizer: Adam (lr=0.001, β₁=0.9, β₂=0.999)
LR Schedule: Warmup-cosine (10-epoch warmup)
Early stopping: Patience 30–40 on validation loss
Batch size: 64–128
L2 regularization: λ = 1e-4 to 1e-5
Gradient clipping: 5.0
Normalization: Z-score, fit on training set only
Backpropagation: Analytic (hand-derived gradients, no autograd)
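Two of the settings above are sketched here: a warmup-cosine learning-rate schedule and z-score normalization fit on the training set only. The linear warmup shape, decay to zero, `total_epochs=200`, and all function names are assumptions for illustration; the card specifies only the base rate and the 10-epoch warmup.

```python
import math
import numpy as np


def warmup_cosine_lr(epoch, base_lr=1e-3, warmup_epochs=10, total_epochs=200):
    """Linear warmup to base_lr, then cosine decay to zero.

    total_epochs and the linear warmup shape are assumed, not from the card.
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))


def fit_zscore(X_train, eps=1e-8):
    """Per-feature mean and std computed on the training set only."""
    return X_train.mean(axis=0), X_train.std(axis=0) + eps


def apply_zscore(X, mean, std):
    """Normalize any split with statistics from the training set."""
    return (X - mean) / std


rng = np.random.default_rng(0)
X_train = rng.normal(5.0, 2.0, size=(100, 3))   # synthetic data
X_test = rng.normal(5.0, 2.0, size=(20, 3))
mean, std = fit_zscore(X_train)
X_test_n = apply_zscore(X_test, mean, std)      # no test statistics leak in
print([warmup_cosine_lr(e) for e in (0, 9, 10, 100)])
```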

Training is fast — all five models train in under 10 minutes total on a laptop CPU.
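Hand-derived gradients are usually validated against finite differences. The sketch below does this for a single softmax layer with cross-entropy loss, where the analytic gradient with respect to the logits is `p - y`. It illustrates the technique only and is not the repository's training code; all names are hypothetical.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def loss(W, b, x, y_onehot):
    """Cross-entropy of a single softmax layer on one sample."""
    p = softmax(x @ W + b)
    return -np.sum(y_onehot * np.log(p))


def analytic_grads(W, b, x, y_onehot):
    """Hand-derived gradients: dL/dz = p - y, dL/dW = outer(x, p - y)."""
    delta = softmax(x @ W + b) - y_onehot
    return np.outer(x, delta), delta


def numeric_grad_W(W, b, x, y_onehot, eps=1e-6):
    """Central finite differences, used only as a correctness check."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps
        Wm[idx] -= eps
        g[idx] = (loss(Wp, b, x, y_onehot) - loss(Wm, b, x, y_onehot)) / (2 * eps)
    return g


rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=3)
x, y = rng.normal(size=4), np.array([0.0, 1.0, 0.0])
gW, gb = analytic_grads(W, b, x, y)
max_err = np.abs(gW - numeric_grad_W(W, b, x, y)).max()
print(f"max abs difference: {max_err:.2e}")
```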

Repository Structure

├── ecg-heartbeat/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 97.2% accuracy
│   └── results.json       # Full benchmark comparison (4 models)
├── eeg-emotions/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 99.1% accuracy
│   └── results.json
├── eye-state/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 94.2% accuracy
│   └── results.json
├── seizure-prediction/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 97.1% accuracy
│   └── results.json
├── har-smartphones/
│   ├── weights.txt        # GoshawkNet Cl(0,2) [16,8] — 94.9% accuracy
│   └── results.json
└── inference.py           # Self-contained inference loader (no dependencies beyond NumPy)

Quick Start

import numpy as np
from inference import load_model

# Load any model by its directory name
model = load_model("ecg-heartbeat")

# Random input only demonstrates the call (187 values per heartbeat)
proba = model.predict_proba(np.random.randn(187))
print(proba)  # 5-class probabilities, e.g. [0.92, 0.01, 0.05, 0.01, 0.01]
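To turn the probability vector into a label, take the argmax. The sketch below assumes the output indices follow the class order listed in the ECG section (N, S, V, F, Q); check `results.json` or `inference.py` before relying on that order. `top_class` is a hypothetical helper, and the input vector stands in for a real `predict_proba` result.

```python
import numpy as np

# Assumption: output indices follow the class order listed for the ECG dataset.
ECG_CLASSES = ["Normal (N)", "Supraventricular (S)", "Ventricular (V)",
               "Fusion (F)", "Unknown (Q)"]


def top_class(proba, class_names):
    """Return (label, confidence) for a 1-D probability vector."""
    proba = np.asarray(proba, dtype=float).ravel()
    k = int(np.argmax(proba))
    return class_names[k], float(proba[k])


# Stand-in for model.predict_proba(...); values are illustrative.
example = [0.92, 0.01, 0.05, 0.01, 0.01]
label, confidence = top_class(example, ECG_CLASSES)
print(label, confidence)
```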

Intended Use

Clinical screening: Pre-filter for ECG/EEG analysis before specialist review
Edge deployment: Wearables, IoT sensors, embedded devices — no GPU, no cloud
Ensemble first stage: Fast, tiny model screens easy cases; complex model handles the rest
Research baseline: Reproducible benchmarks on public datasets with minimal compute
Education: Complete from-scratch neural network with analytic gradients
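The ensemble first-stage use can be sketched as a confidence cascade: the small model answers when its top probability clears a threshold and passes the remaining samples to a larger model. The threshold of 0.9, the `cascade_predict` helper, and both toy models are assumptions for illustration only.

```python
import numpy as np


def cascade_predict(X, small_model, large_model, threshold=0.9):
    """Use small_model when its top probability >= threshold, else large_model.

    Both models are callables mapping an (n, d) array to (n, n_classes)
    probabilities. Returns predicted labels and a mask of escalated rows.
    """
    proba = small_model(X)
    labels = proba.argmax(axis=1)
    escalate = proba.max(axis=1) < threshold
    if escalate.any():
        labels[escalate] = large_model(X[escalate]).argmax(axis=1)
    return labels, escalate


# Toy stand-ins so the sketch runs; neither is a released model.
def toy_small(X):
    p1 = 1.0 / (1.0 + np.exp(-4.0 * X[:, 0]))
    return np.column_stack([1.0 - p1, p1])


def toy_large(X):
    p1 = (X.sum(axis=1) > 0).astype(float)
    return np.column_stack([1.0 - p1, p1])


X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.1, -1.0]])
labels, escalated = cascade_predict(X, toy_small, toy_large)
print(labels, escalated)  # only the third, low-confidence row is escalated
```

The threshold trades compute for accuracy: a higher value sends more samples to the larger model.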

Limitations

Models are trained on tabular/flattened features, not raw waveforms
Per-class accuracy varies: rare classes (ECG Fusion at 63.0%, ECG Supraventricular at 64.6%) have lower recall
No sequence modeling — each sample is classified independently
Medical models are NOT validated for clinical use — research benchmarks only

Kaggle Verification

All results except seizure prediction have been verified with live Kaggle notebook scoring:

Dataset	Kaggle Notebook
ECG Heartbeat	samareddy94/gnaninet-ecg-benchmark
EEG Emotions	samareddy94/99-eeg-emotion-detection-164k-params-no-gpu
EEG Eye State	samareddy94/gnaninet-eeg-eyestate-benchmark
HAR Smartphones	samareddy94/gnaninet-har-benchmark

Citation

@misc{kestrelnet-benchmarks-2026,
  title={KestrelNet/GoshawkNet: Tiny Neural Classifiers for Biosignal and Sensor Data},
  author={Sama Reddy},
  year={2026},
  url={https://huggingface.co/reddysama/kestrelnet-benchmarks}
}

No PyTorch. No TensorFlow. No GPU. Just NumPy and math.


Evaluation results (self-reported)

Dataset	Metric	Value
MIT-BIH Arrhythmia	Accuracy	0.972
MIT-BIH Arrhythmia	Macro F1	0.853
EEG Brainwave Emotions	Accuracy	0.991
EEG Brainwave Emotions	Macro F1	0.991
EEG Eye State (UCI)	Accuracy	0.942
EEG Eye State (UCI)	AUC-ROC	0.986
Bonn University EEG	Accuracy	0.971
Bonn University EEG	AUC-ROC	0.988
UCI HAR Smartphones	Accuracy	0.949
UCI HAR Smartphones	Macro F1	0.949