BakeLab

company

Verified

https://bakeai.inc/research/

AI & ML interests

Open Research from Bake AI

Recent Activity

zhangchenxu authored a paper about 9 hours ago

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

zhangchenxu authored a paper about 9 hours ago

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

zhangchenxu authored a paper about 9 hours ago

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

authored 7 papers about 9 hours ago

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

Paper • 2505.21605 • Published May 27, 2025

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Paper • 2510.09781 • Published Oct 10, 2025 • 27

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

Paper • 2512.06688 • Published Dec 7, 2025 • 2

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Paper • 2603.27771 • Published Mar 29 • 52

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Paper • 2605.12684 • Published May 12 • 11

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

Paper • 2606.05080 • Published 15 days ago • 30

Steering Multimodal Large Language Models Decoding for Context-Aware Safety

Paper • 2509.19212 • Published Sep 23, 2025

authored 2 papers about 1 month ago

BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers?

Paper • 2510.18003 • Published Oct 20, 2025

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Paper • 2605.12684 • Published May 12 • 11

updated a model about 1 month ago

BakeLab/Kallisti-35B-A3B

Image-Text-to-Text • 665k • Updated May 15 • 20 • 2

updated a dataset about 1 month ago

BakeLab/Visual-Aesthetic-Benchmark

Viewer • Updated May 15 • 400 • 92 • 7

published a dataset 4 months ago

BakeLab/Visual-Aesthetic-Benchmark

Viewer • Updated May 15 • 400 • 92 • 7

authored 2 papers 8 months ago

XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making

Paper • 2311.08614 • Published Nov 15, 2023

CoDA: Agentic Systems for Collaborative Data Visualization

Paper • 2510.03194 • Published Oct 3, 2025 • 31

authored a paper 9 months ago

TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments

Paper • 2510.01179 • Published Oct 1, 2025 • 29

authored 2 papers about 1 year ago

VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL

Paper • 2505.23977 • Published May 29, 2025 • 10

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

Paper • 2505.14625 • Published May 20, 2025 • 13

authored 2 papers over 1 year ago

SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities

Paper • 2502.12025 • Published Feb 17, 2025 • 3

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published Mar 4, 2025 • 34