Slipstream - gemma-4-E2B (EVM forecasting agent)

A small code-action agent that forecasts a project's final cost (EAC) and finish period from a mid-flight Earned Value Management snapshot. It is google/gemma-4-E2B-it (the Gemma-4 text decoder at E2B size, roughly 2B effective parameters), fine-tuned with LoRA and then merged, to run a single-tool reasoning loop: it writes Python that calls a curated forecasting toolkit (Earned Schedule, CPI/SPI formulas, a Gompertz growth curve, a reference-class ML regressor, and the TimesFM / Chronos time-series foundation models), reconciles their conflicting estimates, and submits one answer.
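For readers new to the classical methods named above, here is a minimal sketch of the textbook CPI-based EAC and Earned Schedule finish forecast. It is not the Slipstream toolkit: the function names, argument layout, and sample numbers are assumptions made for illustration.

```python
from bisect import bisect_right


def cpi_eac(bac, ev, ac):
    """Cost forecast: EAC = BAC / CPI, where CPI = EV / AC."""
    cpi = ev / ac
    return bac / cpi


def earned_schedule(pv, ev):
    """Earned Schedule: the time at which the plan expected the current EV.

    pv[t] is the cumulative planned value at the end of period t, with
    pv[0] == 0, and must be non-decreasing (assumed layout).
    """
    c = max(bisect_right(pv, ev) - 1, 0)  # last period with pv[c] <= ev
    if c >= len(pv) - 1:
        return float(len(pv) - 1)  # all planned value has been earned
    # Linear interpolation inside period c + 1.
    return c + (ev - pv[c]) / (pv[c + 1] - pv[c])


def es_finish(pv, ev, actual_time):
    """Finish forecast: planned duration / SPI(t), where SPI(t) = ES / AT."""
    planned_duration = len(pv) - 1
    spi_t = earned_schedule(pv, ev) / actual_time
    return planned_duration / spi_t


# Made-up snapshot: 4-period plan, observed at the end of period 2.
pv = [0.0, 10.0, 30.0, 60.0, 100.0]
eac = cpi_eac(bac=100.0, ev=24.0, ac=30.0)
finish = es_finish(pv, ev=24.0, actual_time=2)
```

In this made-up snapshot the project has earned less than planned and spent more than it earned, so both forecasts land above the plan (a cost above the 100 budget and a finish after period 4).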

It was distilled from a DeepSeek V4 teacher: the teacher's reasoning traces over a diverse simulated project corpus were filtered to a 367-trace SFT set (build-small-hackathon/slipstream-evm-sft), and the student was trained with assistant-only loss (loss on reasoning and tool-call tokens only). The result is a sub-5B forecaster, suitable for edge or air-gapped deployment, that matches the classical project-controls baseline and approaches its cloud teacher.
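Assistant-only loss is usually implemented by masking the labels of every non-assistant token so they contribute nothing to the cross-entropy. The sketch below shows that masking step in plain Python. It is an illustration of the general technique, not the training code used here: the role names and token ids are made up, and -100 is the ignore index that PyTorch's cross-entropy loss uses by default.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def mask_non_assistant(token_ids, roles):
    """Return labels that keep assistant tokens and ignore everything else.

    token_ids and roles are parallel lists; roles[i] names the speaker that
    produced token_ids[i] (hypothetical per-token role tagging).
    """
    if len(token_ids) != len(roles):
        raise ValueError("token_ids and roles must have the same length")
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(token_ids, roles)
    ]


# Made-up ids: two prompt tokens, three assistant tokens, one tool-output token.
token_ids = [11, 12, 21, 22, 23, 31]
roles = ["user", "user", "assistant", "assistant", "assistant", "tool"]
labels = mask_non_assistant(token_ids, roles)
```

With labels built this way, only the reasoning and tool-call tokens written by the assistant drive the gradient; prompt text and tool outputs are still visible as context.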

Results (held-out real projects, 40% complete, n=107)

Scored on 107 real completed projects (Batselier/OR-AS DSLIB), under the same protocol as every baseline. valid = fraction of projects for which the method produced a usable forecast; EAC error = median absolute % error on final cost; finish error = median absolute error in periods.

| Method | valid | EAC error | finish error (periods) |
|---|---|---|---|
| gemma-4-E2B (this model, distilled) | 0.991 | 2.31% | 0.63 |
| gemma-4-E2B (base, before distillation) | 0.664 | 3.21% | 0.75 |
| Earned Schedule (classical baseline) | 1 | 2.37% | 1 |
| DeepSeek V4 teacher (cloud) | 1 | 2.4% | 0.6 |

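Distillation lifts a base model that could barely operate the tool-call format (valid 0.664) into a reliable forecaster (valid 0.991). Its EAC error (2.31%) is on par with Earned Schedule (2.37%) and the teacher (2.4%), and its finish error (0.63 periods) is close to the teacher's (0.6) and below Earned Schedule's (1).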
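The three metrics defined above can be sketched as follows. This is an illustrative scorer, not the benchmark code: the data layout is hypothetical, the sample numbers are made up, and it assumes the error medians are taken over valid forecasts only, which the card does not state.

```python
from statistics import median


def score(forecasts, actuals):
    """Compute valid rate, median absolute % EAC error, median finish error.

    forecasts: list of (eac, finish) tuples, or None when the method produced
               no usable forecast (hypothetical layout).
    actuals:   list of (final_cost, final_period) tuples, same order.
    """
    pairs = [(f, a) for f, a in zip(forecasts, actuals) if f is not None]
    valid = len(pairs) / len(forecasts)
    # Assumption: errors are computed over valid forecasts only.
    eac_err = median(100.0 * abs(f[0] - a[0]) / a[0] for f, a in pairs)
    finish_err = median(abs(f[1] - a[1]) for f, a in pairs)
    return {"valid": valid, "eac_error_pct": eac_err, "finish_error": finish_err}


# Made-up example: four projects, one with no usable forecast.
forecasts = [(105.0, 10), None, (90.0, 13), (200.0, 8)]
actuals = [(100.0, 11), (50.0, 5), (100.0, 12), (210.0, 8)]
result = score(forecasts, actuals)
```

Using the median rather than the mean keeps a single badly missed project from dominating the reported error.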

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "build-small-hackathon/slipstream-gemma4-e2b-evm"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="bfloat16")

The model is trained to act through a single run_python(code=...) tool call and to call submit(finish, eac) from inside that code. See the Slipstream project for the agent loop, the forecasting toolkit, and the full benchmark.
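To make the tool contract concrete, here is a minimal sketch of how a harness might execute one run_python call and capture the submit(finish, eac) result. It is not the Slipstream agent loop: the harness shape, the toolkit argument, and the sample code string are assumptions, and exec is used without any sandboxing, so it should only ever run trusted code.

```python
def run_python(code, toolkit=None):
    """Execute model-written code and return its submission, if any.

    toolkit is a hypothetical dict of forecasting helpers exposed to the code.
    Returns {"finish": ..., "eac": ...} or None when submit() was never called.
    """
    submission = {}

    def submit(finish, eac):
        # Record the final answer; a later call overwrites an earlier one.
        submission["finish"] = float(finish)
        submission["eac"] = float(eac)

    namespace = {"submit": submit, **(toolkit or {})}
    exec(code, namespace)  # no sandbox here: illustration only
    return submission or None


# A made-up example of code the model might emit in a tool call.
model_code = """
bac, ev, ac = 100.0, 24.0, 30.0
submit(finish=5, eac=bac / (ev / ac))
"""
answer = run_python(model_code)
```

A full loop would also feed stdout and errors from each call back to the model so it can revise its code before submitting.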

Licence and attribution

This is a derivative of google/gemma-4-E2B-it and is released under the base model's licence (gemma). You must comply with the upstream terms. Training data: build-small-hackathon/slipstream-evm-sft.

Built for the Hugging Face Build Small hackathon.
