John Ho PRO
-
Running • Featured • 232
PaddleOCR-VL Online Demo
Extract text, tables, formulas, and charts from images
-
Running on Zero • Featured • 444
DeepSeek OCR Demo
An interactive demo for the DeepSeek-OCR model.
-
Running on Zero • Featured • 94
LightOnOCR 2 1B Demo
Extract text and tables from images or PDFs
-
Running on Zero • MCP • Featured • 140
Multimodal OCR2
nanonets ocr / smoldocling / monkey ocr / typhoon ocr
-
Build error • 51
Quant
Display interactive data visualizations and apps
-
Running • Featured • 45
Porting nanochat to Transformers: an AI modeling history lesson
Learn about ML and Transformers through nanochat
-
Running on CPU Upgrade • Featured • 2.99k
The Smol Training Playbook
The secrets to building world-class LLMs
-
Running on Zero • Featured • 827
Florence 2
Generate captions, detections, and segmentations from images
-
Runtime error • Featured • 515
Florence2 + SAM2
Segment and caption objects in images and videos
-
Sleeping • Featured • 109
SAM2 Video Predictor
Segment objects in a video with click-based masks
-
Running • 22
SAM2 Video Predictor
Segment and track objects in videos
-
EvanZhouDev/open-genmoji
Text-to-Image • Updated • 76 • 67
-
Running on Zero • Featured • 641
ACE Step
A Step Towards Music Generation Foundation Model
-
Running on Zero • Featured • 600
DreamO
A Unified Framework for Image Customization
-
Running on Zero • Featured • 972
Tile Upscaler
Enhance and upscale images with HDR and AI control
-
Configuration error • Featured • 1.45k
EasyControl Ghibli
New Ghibli EasyControl model is now released!!
-
akiyamasho/AnimeBackgroundGAN-Miyazaki
Image-to-Image • Updated • 25
-
Runtime error • 72
Ghibli Multilingual Text-Rendering
Elevating Ghibli-style AI art beyond ChatGPT's capabilities.
-
Running on A100 • MCP • 44
EasyControl Ghibli
New Ghibli EasyControl model is now released!!
-
Running on Zero • Featured • 61
LightGlue
LightGlue demo
-
Running on Zero • MCP • Featured • 32
Qwen3 VL HF Demo
object detection, visual grounding, keypoint detection
-
prithivMLmods/MetaCLIP-2-Age-Range-Estimator
Image Classification • 21.7M • Updated • 47 • 6
-
Running • Featured • 734
Remove Background Web
In-browser background removal
-
Running • 15
AI Video Editor
Create videos with FFMPEG + Qwen2.5-Coder
-
Searchium-ai/clip4clip-webvid150k
Text-to-Video • 0.2B • Updated • 3k • 43
-
Running • Featured • 432
FastVLM WebGPU
Real-time video captioning powered by FastVLM
-
Runtime error • Featured • 36
AudioRag Demo
Search audio for relevant chunks
-
Running on Zero • Featured • 459
Parakeet-TDT-0.6b-V2
Transcribe audio files with timestamps and export CSV/SRT
-
Running on Zero • 51
Fast Whisper Turbo
Ultra-fast Whisper Turbo inference
-
openai/whisper-large-v3-turbo
Automatic Speech Recognition • Updated • 3.15M • 2.82k
-
Running on Zero • Featured • 347
Realtime Whisper Turbo
Realtime implementation of Whisper large turbo
-
Running on T4 • 121
RF-DETR
SOTA real-time object detection model
-
Running on CPU Upgrade • 50
YOLO ARENA
compare performance of top object detectors
-
Running • 22
SAM2 Video Predictor
Segment and track objects in videos
-
Running on Zero • Featured • 113
VLM Object Understanding
Explore object detection, visual grounding, keypoint Detecti
-
Running on Zero • Featured • 108
Qwen2 VL Localization
Detect objects in images using text prompts
-
Build error • Featured • 160
Seed1.5 VL
Seed1.5-VL API Demo
-
Runtime error • 2
Vision Language SmolVLM2
Video + text to text with SmolVLM2
-
Running on Zero • Featured • 142
Gemma 3n E4B It
Generate text responses to images, videos, and audio
-
Runtime error • 9
Cantonese TTS Text To Speech
Generate Cantonese speech from text
-
Running • 4
Cantonese TTS Playground
Generate speech from Cantonese text using selected or custom voice
-
Running on Zero • Featured • 1.75k
Dia 1.6B
Generate realistic dialogue from a script, using Dia!
-
Runtime error • Featured • 81
Daily Paper Podcast
Generates a podcast about today's top trending paper.