Hello_zjt

hellozjt

AI & ML interests

None yet

Organizations

None yet

upvoted an article 4 months ago

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30, 2025 • 299

upvoted a collection 8 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 562

upvoted 2 collections about 1 year ago

Open LLM Leaderboard best models ❤️‍🔥

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 50 items • Updated Mar 13 • 680

The Big Benchmarks Collection

Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 264

upvoted 2 papers about 1 year ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 447

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 82

upvoted 2 collections over 1 year ago

Llama 3.1 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Dec 6, 2024 • 22

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 710