Language Models Without a Trainable Input Embedding Table (Collection, 3 items). This collection is provided for reproducibility of the paper's main claim.
Bochkov/llm-fix-min-affine-recoded-minimal-code-table-free Text Generation • 0.5B • Updated 1 day ago • 13
Bochkov/llm-fix-min-baseline-learned-input-table-model-classic Text Generation • 0.5B • Updated 1 day ago • 20
Post: A new REPL environment is now available in OpenEnv! ✨ It is used in the Recursive Language Models (RLM) paper by Alex Zhang, and it is ready for inference and post-training using trajectories. It handles long contexts:
> Run Python code in a sandbox
> Make recursive calls to LMs
> Explore data programmatically
> Return the final result
Docs: https://meta-pytorch.org/OpenEnv/environments/repl/
Inference script: https://github.com/meta-pytorch/OpenEnv/blob/main/examples/repl_oolong_simple.py
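The loop the post describes (execute Python in a sandboxed namespace, let the code make recursive LM calls, return a final result) can be sketched as follows. This is a minimal illustration of the pattern only, not the OpenEnv API; `run_python` and `call_lm` are hypothetical names introduced here for illustration.

```python
import contextlib
import io

def run_python(code: str, namespace: dict) -> str:
    """Execute code in a shared namespace and capture its stdout.

    A real REPL environment would add resource limits and isolation;
    this sketch only shows the execute-and-capture shape of the loop.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, namespace)
    return buf.getvalue()

def call_lm(prompt: str) -> str:
    # Stub for a recursive language-model call; a real environment
    # would dispatch this to an inference backend.
    return f"[LM answer to: {prompt}]"

# One step of a trajectory: the model explores data programmatically
# via the shared namespace, then can return a final result.
ns = {"call_lm": call_lm, "data": list(range(10))}
print(run_python("print(sum(data))", ns))             # explore data
print(run_python("print(call_lm('summarize'))", ns))  # recursive LM call
```

Keeping a single shared namespace across `run_python` calls is what lets a long interaction accumulate state instead of re-sending the full context each turn.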
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 48
Bochkov/growing-transformers-model-frozen-16-bit-baseline-monolyth-181m Text Generation • Updated Jan 9 • 33
Bochkov/growing-transformers-model-unfrozen-baseline-monolyth-247m Text Generation • Updated Jan 9 • 5
Bochkov/growing-transformers-model-frozen-unicode-baseline-monolyth-247m Text Generation • Updated Jan 9 • 1