AI & ML interests

A new approach to running LLM/LM inference and training on GPU/NPU backends through C++ implementation and compilation, aiming for high performance and ease of use.

Recent Activity

wxthon updated a model 2 days ago
refinefuture-ai/Qwen3-Smoke
wxthon published a model 2 days ago
refinefuture-ai/Qwen3-Smoke
wxthon updated a model 20 days ago
refinefuture-ai/Qwen3-Lite-3B-0.9B