---
license: apache-2.0
language:
- en
- zh
tags:
- context compression
- sentence selection
- probing classifier
- attention probing
- RAG
- LongBench
pipeline_tag: text-classification
---

# Sentinel Probing Classifier (Logistic Regression)

This repository contains the sentence-level classifier used in **Sentinel**, a lightweight context compression framework introduced in our paper:

> **Sentinel: Attention Probing of Proxy Models for LLM Context Compression with an Understanding Perspective**
> Yong Zhang, Yanwen Huang, Ning Cheng, Yang Guo, Yun Zhu, Yanmeng Wang, Shaojun Wang, Jing Xiao
> [Paper (arXiv 2025)](https://arxiv.org/abs/2505.23277) | [Code on GitHub](https://github.com/yzhangchuck/Sentinel)

---

## What is Sentinel?

**Sentinel** reframes LLM context compression as a lightweight, attention-based *understanding* task. Instead of fine-tuning a full compression model, it:

- Extracts **decoder attention** from a small proxy LLM (e.g., Qwen-2.5-0.5B)
- Computes **sentence-level attention features**
- Applies a **logistic regression (LR) classifier** to select relevant sentences

This approach is efficient, model-agnostic, and highly interpretable.
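
As a rough sketch of those three steps, the snippet below aggregates token-level attention into per-sentence features and scores them with a logistic regression. The sentence spans, feature choice (mean and peak attention), and LR weights here are illustrative stand-ins, not the released model's actual features or parameters:

```python
import math

def sentence_features(token_attn, sent_spans):
    """Aggregate per-token attention weights into one feature vector per sentence."""
    feats = []
    for start, end in sent_spans:
        span = token_attn[start:end]
        # Illustrative features: mean and peak attention mass over the sentence
        feats.append([sum(span) / len(span), max(span)])
    return feats

def lr_score(feat, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of features."""
    z = bias + sum(w * x for w, x in zip(weights, feat))
    return 1.0 / (1.0 + math.exp(-z))

# Toy query->context attention (one weight per context token) and sentence spans
token_attn = [0.01, 0.02, 0.30, 0.25, 0.02, 0.01]
spans = [(0, 2), (2, 4), (4, 6)]

feats = sentence_features(token_attn, spans)
scores = [lr_score(f, weights=[8.0, 4.0], bias=-2.0) for f in feats]
keep = [i for i, s in enumerate(scores) if s > 0.5]  # sentences selected for the compressed context
```

Here the middle sentence receives most of the attention mass and is the only one kept, which is the core selection behavior the classifier implements.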

---

## Files Included

| File | Description |
|-------------------------|----------------------------------------|
| `sentinel_lr_model.pkl` | Trained logistic regression classifier |
| `sentinel_config.json`  | Feature extraction configuration       |

---

## Usage

Use this classifier on attention-derived feature vectors to predict sentence-level relevance scores.

Feature extraction code and the full pipeline are available at:
https://github.com/yzhangchuck/Sentinel
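
A minimal sketch of that load-and-score pattern is below. It pickles a tiny stand-in model so the snippet is self-contained; in practice you would unpickle `sentinel_lr_model.pkl` (an sklearn-style classifier is assumed here) and call `predict_proba` on your attention-derived feature vectors the same way:

```python
import math
import os
import pickle
import tempfile

class StandInLR:
    """Tiny logistic-regression stand-in with an sklearn-like predict_proba interface."""
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias

    def predict_proba(self, X):
        out = []
        for x in X:
            z = self.bias + sum(w * v for w, v in zip(self.weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            out.append((1.0 - p, p))  # (P(irrelevant), P(relevant))
        return out

# Pickle a stand-in so the example runs without the real checkpoint
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(StandInLR(weights=[8.0, 4.0], bias=-2.0), f)

# The usage pattern: load the pickled classifier, score sentence features
with open(path, "rb") as f:
    clf = pickle.load(f)

features = [[0.015, 0.02], [0.275, 0.30]]  # illustrative per-sentence [mean, max] attention
relevance = [p for _, p in clf.predict_proba(features)]
```

Sentences whose relevance probability clears a threshold (e.g., 0.5) are retained in the compressed context.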

## Benchmark Results

<p align="center">
  <img src="longbench_gpt35.png" alt="LongBench GPT-3.5 Results" width="750"/>
</p>

<p align="center">
  <img src="longbench_qwen7b.png" alt="LongBench Qwen Results" width="750"/>
</p>

## Citation

Please cite us if you use this model:

```bibtex
@misc{zhang2025sentinelattentionprobingproxy,
  title={Sentinel: Attention Probing of Proxy Models for LLM Context Compression with an Understanding Perspective},
  author={Yong Zhang and Yanwen Huang and Ning Cheng and Yang Guo and Yun Zhu and Yanmeng Wang and Shaojun Wang and Jing Xiao},
  year={2025},
  eprint={2505.23277},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.23277},
}
```

## Contact

- Email: zhangyong.chuck@gmail.com
- Project: https://github.com/yzhangchuck/Sentinel

## License

Apache License 2.0: free for research and commercial use with attribution.