Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference
Junyan Li
senfu
AI & ML interests
None yet
Recent Activity
submitted a paper about 15 hours ago
FlowCompile: An Optimizing Compiler for Structured LLM Workflows
updated a dataset 9 months ago
senfu/test
published a dataset 9 months ago
senfu/test
Organizations
None yet