Overflow Probe on PISCO representations

A binary MLP probe that detects token overflow in PISCO soft-compressed document representations. Token overflow occurs when a document's information content exceeds the capacity of the compressed token budget, which degrades downstream QA performance.

How It Works

The probe takes a single 4096-dimensional input vector:

Component | Description
mid_q | Last hidden representation from the middle layer (16) of a PISCO decoder model, given the standard prompt, the compressed context, and a question.

Output: probability that the compressed representation has overflowed (i.e., lost critical information).

Installation

pip install torch huggingface_hub

Usage

1. Get the class definition

The model requires the PISCOClassifier class to load. Grab it from this repo:

from huggingface_hub import hf_hub_download
import importlib.util, sys

path = hf_hub_download("wexumin/overflow_probe_pisco_squad", "pisco_clf.py")
spec = importlib.util.spec_from_file_location("pisco_clf", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PISCOClassifier = mod.PISCOClassifier

2. Load the model

model = PISCOClassifier.from_pretrained("wexumin/overflow_probe_pisco_squad")

3. Run inference


# mid_q: layer-16 hidden representation described above, shape (n, 4096)
x = mid_q

probs = model.predict_proba(x)  # (n,) overflow probability
preds = model.predict(x)        # (n,) binary 0/1 (a custom threshold parameter can be provided)
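If you prefer to keep the threshold outside the model call, the probabilities can be binarized by hand. The sketch below is illustrative only: `apply_threshold` is a hypothetical helper, not part of `pisco_clf.py`, and it assumes the output of `predict_proba` can be converted to a NumPy array and that a probability at or above the threshold counts as overflow.

```python
import numpy as np

def apply_threshold(probs, threshold=0.5):
    """Turn overflow probabilities into 0/1 labels (hypothetical helper)."""
    probs = np.asarray(probs, dtype=float)
    # Assumption: probabilities >= threshold are labelled as overflow (1).
    return (probs >= threshold).astype(int)

example_probs = [0.12, 0.50, 0.93]            # made-up values for illustration
default_labels = apply_threshold(example_probs)
strict_labels = apply_threshold(example_probs, threshold=0.95)
```

A higher threshold flags fewer representations as overflowed, trading recall for precision.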

Training Data

  • SQuAD: extractive QA over Wikipedia paragraphs

Each context in the dataset was reduced to just the sentence that answers the question and then padded with noise context up to 128 tokens (as counted by the PISCO encoder tokenizer).
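The padding step can be pictured roughly as follows. This is a sketch under assumptions, not the authors' preprocessing code: `build_context` and `whitespace_count` are hypothetical names, the noise is appended after the answer sentence (the source does not say where it is placed), and whitespace splitting stands in for the PISCO encoder tokenizer.

```python
def whitespace_count(text):
    """Stand-in token counter; the real pipeline would use the PISCO encoder tokenizer."""
    return len(text.split())

def build_context(answer_sentence, noise_sentences, count_tokens, budget=128):
    """Keep the answer sentence and add noise sentences while the budget allows."""
    parts = [answer_sentence]
    used = count_tokens(answer_sentence)
    for sentence in noise_sentences:
        cost = count_tokens(sentence)
        if used + cost > budget:
            break  # stop at the first noise sentence that would exceed the budget
        parts.append(sentence)
        used += cost
    return " ".join(parts), used

context, n_tokens = build_context(
    "a b c", ["d e f g", "h i j k", "l"], whitespace_count, budget=10
)
```

With a subword tokenizer the count of the joined text may differ slightly from the sum of per-sentence counts, so a real implementation would likely re-check the final length.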

Architecture

→ Linear(4096, 512)
→ LayerNorm
→ GELU
→ Dropout(0.3)
→ Linear(512, 128)
→ GELU
→ Dropout(0.2)
→ Linear(128, 1)
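For readers who want to see the layer stack in code, here is a minimal PyTorch sketch of the same sequence. It is not the `PISCOClassifier` class from this repo: `build_probe` is a hypothetical name, the weights are randomly initialized, and applying a sigmoid to the final logit to obtain a probability is an assumption.

```python
import torch
from torch import nn

def build_probe(input_dim: int = 4096) -> nn.Sequential:
    """Layer stack mirroring the architecture listed above (hypothetical helper)."""
    return nn.Sequential(
        nn.Linear(input_dim, 512),
        nn.LayerNorm(512),
        nn.GELU(),
        nn.Dropout(0.3),
        nn.Linear(512, 128),
        nn.GELU(),
        nn.Dropout(0.2),
        nn.Linear(128, 1),
    )

probe = build_probe().eval()              # eval() disables dropout
n_params = sum(p.numel() for p in probe.parameters())

with torch.no_grad():
    logits = probe(torch.randn(2, 4096))  # random stand-ins for two mid_q vectors
    # Assumption: a sigmoid maps the single logit to an overflow probability.
    probs = torch.sigmoid(logits).squeeze(-1)
```

Almost all of the roughly 2.16 million parameters sit in the first linear layer, so the probe stays small relative to the decoder it inspects.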

Citation

@inproceedings{belikova-etal-2026-detecting,
    title = "Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation",
    author = "Belikova, Julia  and Rozhevskii, Danila  and Svirin, Dennis  and Polev, Konstantin  and Panchenko, Alexander",
    editor = "Baez Santamaria, Selene  and Somayajula, Sai Ashish  and Yamaguchi, Atsuki",
    booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 4: Student Research Workshop)",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.eacl-srw.59/",
    pages = "797--810",
    ISBN = "979-8-89176-383-8"
}