Hugging Face
Open to Collab
36773.5 TFLOPS
44
4
53
Sk md saad amin
Reality123b
42 followers · 63 following
AI & ML interests
None yet
Recent Activity
replied to Janady07's post about 3 hours ago
Here is one of the equations that make up the world's first Artificial General Intelligence. Remember, when building Artificial Intelligence or anything on a device, it all starts out binary. Everything starts out with data flow, physics, and mathematics.
updated a Space about 6 hours ago: lap-quantum/README
reacted to mrs83's post with 🔥 1 day ago
In 2017, my RNNs were babbling. Today, they are hallucinating beautifully. 10 years ago, getting an LSTM to output coherent English was a struggle. 10 years later, after a "cure" based on FineWeb-EDU and a custom synthetic mix for causal conversation, the results are fascinating.

We trained this on ~10B tokens on a single AMD GPU (ROCm). It is not a Transformer: Echo-DSRN (400M) is a novel recurrent architecture inspired by Hymba, RWKV, and xLSTM, designed to challenge the "Attention is All You Need" monopoly on the Edge. The ambitious goal is to build a small instruct model with RAG and tool usage capabilities (https://huggingface.co/ethicalabs/Kurtis-EON1)

The Benchmarks (Size: 400M)
For a model this size (trained on <10B tokens), the specialized performance is surprising:
*SciQ*: 73.8% (This rivals billion-parameter models in pure fact retrieval).
*PIQA*: 62.3% (Solid physical intuition for a sub-1B model).

The Reality Check: HellaSwag (29.3%) and Winogrande (50.2%) show the limits of 400M parameters and 10B tokens training. We are hitting the "Reasoning Wall" which confirms we need to scale to (hopefully) unlock deeper common sense.

As you can see in the visualization (to be released soon on HF), the FineWeb-EDU bias is strong. The model is convinced it is in a classroom ("In this course, we explore..."). The Instruct Model is not ready yet and we are currently using curriculum learning to test model plasticity.

Source code and weights will not be released yet. This is not a fork or a fine-tune: the base model is built in-house at https://www.ethicalabs.ai/, with novel components that do not exist in current open libraries.

Call for Collaboration: I am looking for Peer Reviewers interested in recurrent/hybrid architectures. If you want to explore what lies beyond Transformers, let’s connect!

Training diary: https://huggingface.co/ethicalabs/Kurtis-EON1
Organizations
Reality123b's activity
New activity in
agents-course/notebooks
1 day ago
404 Not Found when using Qwen models with HuggingFaceInferenceAPI
1
2
#130 opened about 2 months ago by
Milkfish033
New activity in
mcp-course/unit_1_quiz
3 days ago
Geschafft ("Done")
4
#168 opened 6 days ago by
sky-meilin
New activity in
SYNAPTYX0AI/README
5 days ago
Update README.md
#1 opened 5 days ago by
Reality123b
New activity in
blog-explorers/README
12 days ago
[Support] Community Articles
1
103
#5 opened almost 2 years ago by
victor
New activity in
DataMuncher-Labs/README
15 days ago
Quantum Computing
49
#2 opened 28 days ago by
Reality123b
New activity in
DataMuncher-Labs/UltraMath-Reasoning-Small
about 1 month ago
Formatting
7
#1 opened about 1 month ago by
Roman190928
New activity in
Roman190928/NUM32
about 1 month ago
...
1
#1 opened about 1 month ago by
Reality123b
New activity in
DataMuncher-Labs/UltraMath-Reasoning-Small
about 1 month ago
Update README.md
1
#2 opened about 1 month ago by
Roman190928
New activity in
DataMuncher-Labs/TrainingTime
about 1 month ago
Update app.py
1
#1 opened about 1 month ago by
Reality123b
New activity in
Lap1official/Math
about 1 month ago
Librarian Bot: Add language metadata for dataset
#2 opened about 1 year ago by
librarian-bot
New activity in
DataMuncher-Labs/README
about 1 month ago
Need data for a new model
13
#1 opened about 1 month ago by
Reality123b
New activity in
Roman190928/MicroGPT
about 1 month ago
Some kind of advice for a v2
2
#2 opened about 1 month ago by
Reality123b
New activity in
SmallDoge/Doge-20M-Instruct
about 2 months ago
Error
#2 opened about 2 months ago by
Reality123b
New activity in
huggingface/InferenceSupport
4 months ago
sdobson/nanochat
#5559 opened 4 months ago by
Reality123b
New activity in
smolagents/SmolVLM2-2.2B-Instruct-Agentic-GUI
4 months ago
can anyone tell me how i can use this model for computer use tasks?
1
#1 opened 4 months ago by
Reality123b
New activity in
huggingface/InferenceSupport
4 months ago
smolagents/SmolVLM2-2.2B-Instruct-Agentic-GUI
#5294 opened 4 months ago by
Reality123b
New activity in
huggingface/InferenceSupport
10 months ago
HuggingFaceTB/SmolVLM2-2.2B-Instruct
10
1
#1098 opened 10 months ago by
Reality123b
ds4sd/SmolDocling-256M-preview
36
3
#69 opened 11 months ago by
Reality123b