Sitong CHENG

cmots

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 2 months ago

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published Dec 22, 2025 • 65
upvoted a paper 3 months ago

Step-Audio-R1 Technical Report

Paper • 2511.15848 • Published Nov 19, 2025 • 58
upvoted a paper 4 months ago

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km

Paper • 2510.09606 • Published Oct 10, 2025 • 18
upvoted a paper 5 months ago

UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice

Paper • 2509.21144 • Published Sep 25, 2025 • 1
upvoted a paper 12 months ago

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published Feb 23, 2025 • 36
upvoted a paper over 1 year ago

VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

Paper • 2406.05370 • Published Jun 8, 2024 • 17
upvoted a paper almost 2 years ago

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

Paper • 2405.00233 • Published Apr 30, 2024 • 17