AI & ML interests

I work on LLMs for my projects


Organization Card
     ___   _  ____             _
    |_  | |_| |  _ \ __ _  ___| | __
      | | | | | |_) / _` |/ __| |/ /
  /\__/ / | | |  _ < (_| | (__|   <
  \____/  |_| |_| \_\__,_|\___|_|\_\

  Java Intelligent Rack System: PyTorch models with ML scripts and chatbots.
  JiRack LLM by CMS Manhattan, with open-source model code.

  Web services with ONNX Runtime inference images, with OpenAI- and Ollama-compatible REST API support, available on Docker Hub.
  ONNX Runtime supports running models on many GPU cards and in data centers.
  Docker repo: https://hub.docker.com/u/cmsmanhattan

🔄 CMSManhattan - Frontier Ternary Neural Networks

Creating the world's first 405B parameter ternary model

🎯 Mission

Democratizing access to massive language models through extreme efficiency. Training state-of-the-art LLMs on accessible hardware using 1.58-bit (ternary) precision.


šŸ† JiRackTernary Series

A family of efficient large language models based on BitNet architecture with ternary weights {-1, 0, 1}.
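The ternary constraint can be illustrated with an absmean-style quantizer, as described for BitNet b1.58: scale weights by their mean absolute value, then round into {-1, 0, 1}. A minimal NumPy sketch (function names are illustrative, not taken from the JiRack codebase):

```python
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-5):
    """Absmean quantization: scale by the mean |weight|, round to {-1, 0, 1}."""
    scale = np.abs(w).mean() + eps                # per-tensor scale factor
    wq = np.clip(np.round(w / scale), -1, 1)      # ternary codes
    return wq.astype(np.int8), scale              # w is approximated by wq * scale

w = np.array([[0.9, -0.05, -1.2],
              [0.4,  0.0,  -0.6]])
wq, s = ternarize(w)
# wq holds only values from {-1, 0, 1}
```

The original full-precision tensor is approximated by `wq * s`, so only the int8 codes and one scalar need to be stored per tensor.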

🌐 Public Models

| Model | Parameters | Size | Status | Link |
|-------|------------|------|--------|------|
| JiRackTernary_1b | 1B | ~350MB | ✅ Released | Download |
| JiRackTernary_8b | 8B | ~3GB | ✅ Released | Download |

🔒 Private Models (In Training)

| Model | Parameters | Size | Status | ETA |
|-------|------------|------|--------|-----|
| JiRackTernary_70b | 70B | ~25GB | 🚧 Training (Step 15,600+) | Q2 2026 |
| JiRackTernary_405b | 405B | ~115GB | 🔄 World's first 405B ternary | Q3 2026 |

⚔ Key Innovations

~5.6x Compression

Traditional LLaMA-3 70B:  ~140 GB (FP16)
JiRackTernary 70B:        ~25 GB  (1.58-bit)
Compression ratio:        ~5.6x smaller! 🔄
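As a back-of-envelope check (weights only, ignoring embeddings, activations, and any layers kept in higher precision):

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (decimal): params * bits / 8 bytes / 1e9."""
    return params * bits_per_weight / 8 / 1e9

fp16_gb = weight_gb(70e9, 16)    # 140.0 GB for FP16
packed_gb = weight_gb(70e9, 2)   # 17.5 GB at 2 bits/weight (4 weights per byte)
```

Ternary weights need log2(3) ≈ 1.58 bits in theory; a practical 4-per-byte packing spends 2 bits per weight, and the gap between 17.5 GB and the ~25 GB checkpoint presumably comes from non-ternary components such as embeddings and norms.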

Accessible Training

  • 70B trained on single A100 80GB (Colab Pro+ - $50/month)
  • 405B trained on single H200 141GB (Colab Enterprise)
  • Novel layer-by-layer training approach
  • No supercomputer clusters required!

Production-Ready Architecture

  • LLaMA-based with BitLinear layers
  • Ultra-lean memory offloading
  • 4-in-1 weight packing
  • Optimized for inference speed
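The "4-in-1 weight packing" above presumably stores four 2-bit ternary codes per byte. A hedged NumPy sketch of one such scheme (the actual JiRack bit layout is not documented here):

```python
import numpy as np

def pack4(tern: np.ndarray) -> np.ndarray:
    """Pack ternary weights 4-per-byte: map {-1,0,1} -> codes {0,1,2}, 2 bits each."""
    codes = (tern.astype(np.int16) + 1).astype(np.uint8).reshape(-1, 4)
    return (codes[:, 0] | (codes[:, 1] << 2)
            | (codes[:, 2] << 4) | (codes[:, 3] << 6)).astype(np.uint8)

def unpack4(packed: np.ndarray) -> np.ndarray:
    """Inverse: extract the four 2-bit codes and map back to {-1, 0, 1}."""
    codes = np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1)
    return codes.astype(np.int8).reshape(-1) - 1

w = np.array([-1, 0, 1, 1, 0, -1, 0, 1], dtype=np.int8)
packed = pack4(w)            # 8 weights -> 2 bytes
restored = unpack4(packed)   # round-trips exactly
```

Packing at 2 bits per weight yields the 4x density behind the memory figures above; inference kernels can unpack on the fly or operate on the codes directly.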

🔬 Technical Highlights

Architecture Details

- Base: LLaMA-3 architecture
- Precision: 1.58-bit ternary weights {-1, 0, 1}
- Layers: Custom JiRackBitLinear with weight packing
- Normalization: RMSNorm
- Training: Layer-by-layer with gradient accumulation
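RMSNorm, used above, scales activations by their root-mean-square instead of subtracting a mean and dividing by a standard deviation as LayerNorm does. A minimal NumPy sketch:

```python
import numpy as np

def rmsnorm(x: np.ndarray, gain: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMSNorm: x / rms(x) * gain -- no mean-centering, unlike LayerNorm."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

x = np.array([3.0, 4.0])
y = rmsnorm(x, np.ones_like(x))
# the output has unit RMS (up to eps)
```

Dropping the mean-centering step saves one reduction per call, which is one reason RMSNorm is the default in LLaMA-style architectures.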

Training Infrastructure

70B Model:
├── Hardware: A100 80GB (Colab Pro+)
├── Method: Layer-by-layer training
├── Batch size: 1 (micro)
├── Sequence length: 768 tokens
└── Cost: ~$50/month
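With a micro-batch of 1, gradient accumulation sums per-sample gradients and applies one optimizer step every N samples, matching a batch-of-N update while holding only one sample's activations in memory. A toy plain-Python sketch using the loss 0.5*(w - x)^2 (not the actual training loop):

```python
def sgd_accumulate(w: float, samples, lr: float = 0.1, accum: int = 4) -> float:
    """One-parameter SGD with gradient accumulation over `accum` micro-batches."""
    grad_sum, count = 0.0, 0
    for x in samples:
        grad_sum += (w - x)          # gradient of 0.5 * (w - x)**2 w.r.t. w
        count += 1
        if count == accum:           # one optimizer step per `accum` samples
            w -= lr * grad_sum / accum
            grad_sum, count = 0.0, 0
    return w

w = sgd_accumulate(0.0, [1.0, 1.0, 1.0, 1.0])
# identical to one batch-of-4 step: w = 0.0 - 0.1 * (-1.0) = 0.1
```

The same trade (time for memory) is what makes a micro-batch of 1 with a 768-token sequence fit on a single A100.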

405B Model:
├── Hardware: H200 141GB (Colab Enterprise)
├── Method: Advanced layer-by-layer
├── Optimized for massive scale
└── World's first 405B ternary model 🏆

📊 Performance

Model Comparison

| Model | Size | Precision | Hardware | Training Cost |
|-------|------|-----------|----------|---------------|
| LLaMA-3 70B | 140GB | FP16 | Massive cluster | $$$$$$ |
| LLaMA-3 70B (4-bit) | 35GB | 4-bit | 2-4x A100 | N/A (PTQ) |
| JiRackTernary 70B | 25GB | 1.58-bit | 1x A100 | $150-200 |

Current Training Status (Updated: 2026-02-09)

70B Model:

  • Step: 15,600+
  • Loss: ~7-9
  • PPL: ~3,000-5,000
  • Status: Early training phase (target: 100k+ steps)
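For reference, perplexity is the exponential of the mean per-token cross-entropy (in nats), so the loss and PPL figures above can be cross-checked:

```python
import math

def perplexity(mean_nll: float) -> float:
    """PPL = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(mean_nll)

perplexity(8.0)   # ~2981
perplexity(8.5)   # ~4915
```

A loss around 8-8.5 nats corresponds to the reported PPL of ~3,000-5,000.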

405B Model:

  • Status: Active training on H200
  • Target: World's first converged 405B ternary model
  • Timeline: Q3 2026 estimated completion

🎓 Research & Publications

Upcoming

šŸ“ Technical Paper (In Preparation)

  • Title: "JiRackTernary-405B: Scaling Ternary Neural Networks to 405 Billion Parameters"
  • Target: NeurIPS 2026 / ICML 2027

📊 Benchmark Suite

  • MMLU, HellaSwag, HumanEval
  • Comparison with LLaMA-3, Mixtral, DeepSeek
  • Efficiency metrics (inference speed, memory)

🎤 Open Source Release

  • Training code & documentation
  • Layer-by-layer methodology
  • Reproducibility guidelines

🚀 Why This Matters

For Researchers

✅ Train massive models without supercomputers
✅ Reproduce frontier research on Colab
✅ Enable new compression research directions

For Industry

✅ Deploy 405B-class models on fewer GPUs
✅ Faster inference with ternary operations
✅ Lower hosting costs (~5.6x smaller)

For Community

✅ Democratization of large language models
✅ Accessible AI for everyone
✅ Open research methodology


📖 Learn More

Blog Posts (Coming Soon)

  • 🔧 "Training 70B on a Single A100: Our Layer-by-Layer Approach"
  • 📊 "Ternary Weights at Scale: Lessons from 15,000 Steps"
  • 🚀 "Road to 405B: The Journey to World's First Ternary Mega-Model"

Technical Documentation

  • 📚 Architecture deep-dive
  • 🛠️ Training methodology
  • 💻 Code examples & tutorials

šŸ¤ Community

Get Involved

  • ⭐ Star our models on HuggingFace
  • 💬 Join discussions in model repos
  • 🐛 Report issues or suggestions
  • 📧 Contact: [Your contact method]

Citation

@misc{jirackternary2026,
  author = {CMSManhattan (kgrabko)},
  title = {JiRackTernary: Scaling Ternary Neural Networks to 405 Billion Parameters},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/CMSManhattan}
}

šŸ† Recognition

World's First 405B Ternary Model 🔄
Proving that massive language models can be trained efficiently on accessible hardware


📊 Follow Progress

Track our journey:

  • 🔄 Regular updates in model repos
  • 📈 Training metrics & visualizations
  • 🎯 Milestone announcements
  • 🎓 Research publications

Making AI accessible, one ternary weight at a time. ✨


Last updated: 2026-02-09

Trademark information: https://uspto.report/TM/90579072

  • SERIAL NUMBER 90579072