Locke (Locke Li)

updated a model about 2 months ago

meituan-longcat/LongCat-Next

Any-to-Any • 74B • Updated Apr 10 • 1.15k • 179

liked a model about 2 months ago

tencent/Hy-MT1.5-1.8B-1.25bit-GGUF

Translation • 2B • Updated 22 days ago • 632 • 19

liked a model 3 months ago

google/gemma-4-31B

Image-Text-to-Text • 33B • Updated 23 days ago • 513k • 430

New activity in meituan-longcat/LongCat-Next 3 months ago

LongCat-Next does not emit opening tag during tool calling

2

#4 opened 3 months ago by

kernelpool

repeat input_ids.size(1) in dim1

1

#3 opened 3 months ago by

kangyang

upvoted a paper 3 months ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published Mar 29 • 148

liked a model 3 months ago

meituan-longcat/LongCat-Next

Any-to-Any • 74B • Updated Apr 10 • 1.15k • 179

published a model 3 months ago

meituan-longcat/LongCat-Next

Any-to-Any • 74B • Updated Apr 10 • 1.15k • 179

upvoted a paper 5 months ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 105

commented on Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation 9 months ago

Could you provide some reference code?
I'm using the trainer and am confused by the dataloader and DistributedSampler.
Different ranks in the same sp_group always fail to get the same data indices.

New activity in baichuan-inc/Baichuan-Audio-Instruct 12 months ago

ImportError (vector_quantize) when loading the model

3

#2 opened about 1 year ago by

Jeronymous

liked a model about 1 year ago

deepseek-ai/DeepSeek-Prover-V2-671B

Text Generation • 685B • Updated Apr 30, 2025 • 683 • 831

commented on a paper about 1 year ago

Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published Mar 26, 2025 • 4 • 3

upvoted a paper over 1 year ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 166

liked a dataset about 2 years ago

HuggingFaceFW/fineweb

Viewer • Updated Jul 11, 2025 • 52.5B • 295k • 2.9k

Locke Li

AI & ML interests

Recent Activity

Organizations
