Yunzhi Yao
cowTodd
AI & ML interests
None yet
Recent Activity
authored a paper about 2 hours ago
Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics
upvoted a paper about 3 hours ago
Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics
authored a paper 12 days ago
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency