Submitted by
Shauli Ravfogel
AI & ML interests
None defined yet.
Recent Activity
Papers
Can LLMs Introspect? A Reality Check
Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors